首页 > 解决方案 > 计算事件之间的系统状态

问题描述

这是控制机器(或)状态的命令(ON或)的时间线(从时间戳 1 到 66 )。OFFONLNIEOFFLINE

Timeline  ------------|-------------|--------|------|---------|------------>

Command               ON            OFF      OFF    ON        OFF

请注意,第二个OFF命令是无用的。

因此,机器状态应该是:

State     ---OFFLINE--|----ONLINE---|----OFFLINE----|--ONLINE-|--OFFLINE----

请注意,默认状态是OFFLINE

我有 100,000 台机器的以下数据:

Machine Timestamp    Command
======= =========    =======
1       13           ON
1       27           OFF 
1       36           OFF
1       43           ON
1       53           OFF
...
n       ...          ...

我想创建下表:

Machine   From    To    State
=======   ====    ==    =====
1         1       13    OFFLINE 
1         13      27    ONLINE
1         27      43    OFFLINE
1         43      53    ONLINE
1         53      66    OFFLINE
...
n         ...     ...   ...

From具有包容性,并且To是排他性的,感谢@MatBailie)

在 SQL (DB2) 或 R 中执行此操作的有效方法是什么?

标签: sqlrtimestampdb2time-series

解决方案


一个 SQL 答案...

http://sqlfiddle.com/#!18/5606d/2 # QUERY 语法在 DB2 中应该相同

SELECT
  machine_id,
  MAX(       command)    AS state,
  MIN(     timestamp)    AS start_timestamp,
  LEAD(MIN(timestamp))
    OVER (PARTITION BY machine_id
              ORDER BY GroupID
         )
                         AS cease_timestamp  
FROM
(
  SELECT
    *,
    SUM(CASE WHEN prev_command = command THEN 0 ELSE 1 END)
      OVER (PARTITION BY machine_id
                ORDER BY timestamp
           )
        AS GroupID
  FROM
  (
    SELECT
      *,
      LAG(command)
        OVER (PARTITION BY machine_id
                  ORDER BY timestamp
             )
               AS prev_command
    FROM
    (
      SELECT machine_id, timestamp, command FROM machine_events
      UNION ALL
      SELECT machine_id,         1,   'OFF' FROM machine_events GROUP BY machine_id HAVING MIN(timestamp) > 1
    )
      expanded_events
  )
    look_back
)
  grouped
GROUP BY
  machine_id,
  GroupID
ORDER BY
  machine_id,
  GroupID

推荐阅读