How to use OVER with PARTITION BY

Problem description

I have this table:

ID  BS  time
1   1   14:10:00
1   1   14:10:05
1   1   15:04:03
1   2   16:18:05
1   2   17:00:09
1   3   18:33:50
1   1   19:03:14
1   1   19:10:23

and I expect:

ID  BS  start_time  end_time
1   1   14:10:00    16:18:05
1   2   16:18:05    18:33:50
1   3   18:33:50    19:03:14
1   1   19:03:14    19:10:23

I tried using LEAD, but I don't know how to handle the case where a BS value appears again after its earlier run has ended.

SELECT id,bs,time,--min(time) time_start,
lead(time,1) over (partition by id order by time) next_time,
FROM `sage-facet-114619.Temp_data.temp_table`
order by id,time

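After that I was thinking of grouping, but the repeated BS value causes a problem: a plain GROUP BY on BS would merge the two separate runs of BS = 1.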

Tags: sql, google-bigquery

Solution


The query below is for BigQuery Standard SQL (and it actually returns the expected result, which is not the case for the other two answers):

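#standardSQL
-- Collapse each run of consecutive rows with the same bs (per id) into one interval.
SELECT id, bs,
  MIN(time) AS start_time,
  -- The last row per id has no next row, so fall back to its own time.
  MAX(IFNULL(end_time, time)) AS end_time
FROM (
  SELECT id, bs, time, end_time,
    -- Running count of bs changes: every run of equal bs gets its own grp.
    COUNTIF(flag) OVER(PARTITION BY id ORDER BY time) AS grp
  FROM (
    SELECT *,
      LEAD(time) OVER win AS end_time,  -- time of the next row
      bs != LAG(bs) OVER win AS flag    -- TRUE where bs differs from the previous row
    FROM `sage-facet-114619.Temp_data.temp_table`
    WINDOW win AS (PARTITION BY id ORDER BY time)
  )
)
GROUP BY id, bs, grp
ORDER BY id, start_time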

Applied to the sample data from your question, the output is:

Row id  bs  start_time  end_time     
1   1   1   14:10:00    16:18:05     
2   1   2   16:18:05    18:33:50     
3   1   3   18:33:50    19:03:14     
4   1   1   19:03:14    19:10:23     
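
To see why this works, the same logic can be traced outside BigQuery. The following is a minimal plain-Python sketch, not part of the original answer: the function name `collapse_runs` and the in-memory `rows` list are made up for illustration, and times are kept as zero-padded `HH:MM:SS` strings so that string order matches time order. Each run of consecutive equal `bs` values plays the role of one `grp`, and the start of the next run plays the role of `LEAD(time)` on the run's last row.

```python
from itertools import groupby

# Sample rows from the question as (id, bs, time).
# Assumption: times are zero-padded strings, so sorting them sorts by time.
rows = [
    (1, 1, "14:10:00"),
    (1, 1, "14:10:05"),
    (1, 1, "15:04:03"),
    (1, 2, "16:18:05"),
    (1, 2, "17:00:09"),
    (1, 3, "18:33:50"),
    (1, 1, "19:03:14"),
    (1, 1, "19:10:23"),
]


def collapse_runs(rows):
    """Collapse consecutive rows with the same bs (per id) into intervals."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))  # PARTITION BY id ORDER BY time
    result = []
    for row_id, id_rows in groupby(ordered, key=lambda r: r[0]):
        # groupby only merges adjacent equal keys, so a bs value that
        # comes back later starts a new run (the "grp" of the SQL query).
        runs = [list(run) for _, run in groupby(id_rows, key=lambda r: r[1])]
        for i, run in enumerate(runs):
            start_time = run[0][2]
            if i + 1 < len(runs):
                # End of this run = first time of the next run (the LEAD value).
                end_time = runs[i + 1][0][2]
            else:
                # No next row: fall back to the run's own last time (the IFNULL).
                end_time = run[-1][2]
            result.append((row_id, run[0][1], start_time, end_time))
    return result


for interval in collapse_runs(rows):
    print(interval)
```

Run on the sample data, this yields the same four intervals as the table above, including the two separate intervals for `bs = 1`.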
