Count consecutive TRUE values and show the running count in a separate column

Problem description

Sample data in a Snowflake table:

| EMAIL | Date |  EmailClicked|
| -------- | -------------- | -------- |
| abc@gmail.com| 01-01-2021| TRUE |
| abc@gmail.com| 02-01-2021| FALSE|
| def@gmail.com| 02-01-2021| FALSE|
| abc@gmail.com| 03-01-2021| TRUE |
| abc@gmail.com| 04-01-2021| TRUE |
| abc@gmail.com| 05-01-2021| TRUE |
| def@gmail.com| 03-01-2021| TRUE|
| def@gmail.com| 06-01-2021| FALSE|
| abc@gmail.com| 06-01-2021| FALSE|
| abc@gmail.com| 07-01-2021| TRUE |
| abc@gmail.com| 08-01-2021| TRUE |

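Expected output, with the running count of consecutive TRUE values shown in an additional column (the count resets to 0 on every FALSE):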

| EMAIL | Date |  EmailClicked| Consecutivecount |
| -------- | -------------- | -------- | -------- |
| abc@gmail.com| 01-01-2021| TRUE | 1 |
| abc@gmail.com| 02-01-2021| FALSE| 0 |
| abc@gmail.com| 03-01-2021| TRUE | 1 |
| abc@gmail.com| 04-01-2021| TRUE | 2 |
| abc@gmail.com| 05-01-2021| TRUE | 3 |
| abc@gmail.com| 06-01-2021| FALSE| 0 |
| abc@gmail.com| 07-01-2021| TRUE | 1 |
| abc@gmail.com| 08-01-2021| TRUE | 2 |
| def@gmail.com| 02-01-2021| FALSE| 0 |
| def@gmail.com| 03-01-2021| TRUE| 1 |
| def@gmail.com| 06-01-2021| FALSE| 0 |

Tags: sql, snowflake-cloud-data-platform

Solution


This is a gaps-and-islands problem. One approach is the difference-of-row-numbers method:

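WITH cte AS (
    SELECT *,
           -- rn1: position of each row within its EMAIL, ordered by Date
           ROW_NUMBER() OVER (PARTITION BY EMAIL ORDER BY Date) rn1,
           -- rn2: position within the same EMAIL and EmailClicked value
           ROW_NUMBER() OVER (PARTITION BY EMAIL, EmailClicked ORDER BY Date) rn2
    FROM yourTable
)

-- rn1 - rn2 stays constant within each run of identical EmailClicked values
SELECT EMAIL, Date, EmailClicked,
       EmailClicked::int * ROW_NUMBER() OVER (PARTITION BY EMAIL, EmailClicked, rn1 - rn2
                                              ORDER BY Date) AS Consecutivecount
FROM cte
ORDER BY EMAIL, Date;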
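
To see why the row-number difference identifies each run, the same logic can be traced outside the database. The following is a minimal Python sketch, not the answer's own code: it assumes rn1 is numbered within each EMAIL, and the names `rows` and `consecutive_counts` are hypothetical, introduced only for illustration. It replays the sample rows from the question and computes rn1, rn2, and the per-island row number by hand.

```python
from collections import defaultdict

# Sample rows from the question: (email, date, clicked).
rows = [
    ("abc@gmail.com", "01-01-2021", True),
    ("abc@gmail.com", "02-01-2021", False),
    ("def@gmail.com", "02-01-2021", False),
    ("abc@gmail.com", "03-01-2021", True),
    ("abc@gmail.com", "04-01-2021", True),
    ("abc@gmail.com", "05-01-2021", True),
    ("def@gmail.com", "03-01-2021", True),
    ("def@gmail.com", "06-01-2021", False),
    ("abc@gmail.com", "06-01-2021", False),
    ("abc@gmail.com", "07-01-2021", True),
    ("abc@gmail.com", "08-01-2021", True),
]


def consecutive_counts(rows):
    """Mirror the row-number-difference query in plain Python."""
    rn1 = defaultdict(int)  # row number per email
    rn2 = defaultdict(int)  # row number per (email, clicked)
    rn3 = defaultdict(int)  # row number per island
    out = []
    # Assumption: the sample date strings differ only in their first two
    # digits, so plain string order matches date order for this data.
    for email, date, clicked in sorted(rows, key=lambda r: (r[0], r[1])):
        rn1[email] += 1
        rn2[email, clicked] += 1
        # The difference is constant while the clicked value repeats.
        island = (email, clicked, rn1[email] - rn2[email, clicked])
        rn3[island] += 1
        # Multiplying by the flag zeroes out the FALSE rows.
        out.append((email, date, clicked, int(clicked) * rn3[island]))
    return out


result = consecutive_counts(rows)
for row in result:
    print(row)
```

Within one email, rn1 advances on every row while rn2 advances only on rows with the same EmailClicked value, so their difference changes exactly when a run is interrupted. That is what makes it usable as a grouping key.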
