Get the proper date with GROUP BY

Problem description

I have a table that tracks how long each user watches films per day. Now I want to count unique views per date.

So the conditions are:

But the problem is getting the correct date row. For example, row1.duration + row2.duration >= 120, so view count = 1 should be applied to 2021-10-16.

| id | user_id | duration | created_at | film_id |
+----+---------+----------+------------+---------+
| 1  | 1       | 80       | 2021-10-15 | 1       |
| 2  | 1       | 70       | 2021-10-16 | 1       |
| 3  | 1       | 200      | 2021-10-17 | 2       |
| 4  | 2       | 50       | 2021-10-18 | 1       |
| 5  | 2       | 90       | 2021-10-18 | 1       |
| 6  | 3       | 140      | 2021-10-18 | 2       |
| 7  | 4       | 10       | 2021-10-19 | 3       |

Expected result:

| cnt   | created_at |
+-------+------------+
| 0     | 2021-10-15 |
| 1     | 2021-10-16 |
| 0     | 2021-10-17 |
| 2     | 2021-10-18 |
| 0     | 2021-10-19 |
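
To make the intended rule concrete, here is a minimal Python sketch of how the expected result above can be derived. It assumes (this is my reading of the example, not something stated explicitly) that each user yields at most one view, attributed to the first date on which that user's running total reaches 120. The names `ROWS`, `THRESHOLD` and `views_per_date` are hypothetical.

```python
from collections import defaultdict

# Sample rows from the table above: (id, user_id, duration, created_at, film_id)
ROWS = [
    (1, 1, 80, "2021-10-15", 1),
    (2, 1, 70, "2021-10-16", 1),
    (3, 1, 200, "2021-10-17", 2),
    (4, 2, 50, "2021-10-18", 1),
    (5, 2, 90, "2021-10-18", 1),
    (6, 3, 140, "2021-10-18", 2),
    (7, 4, 10, "2021-10-19", 3),
]

THRESHOLD = 120  # assumed from the example: a view needs a total of 120 or more


def views_per_date(rows, threshold=THRESHOLD):
    """Count, per date, the users whose running total first reaches the threshold."""
    counts = {row[3]: 0 for row in rows}  # every date starts at 0, so zero rows survive
    running = defaultdict(int)            # user_id -> accumulated duration
    counted = set()                       # users that have already produced a view
    # Walk the rows chronologically (id breaks ties within a date).
    for _id, user_id, duration, created_at, _film_id in sorted(
        rows, key=lambda r: (r[3], r[0])
    ):
        running[user_id] += duration
        if user_id not in counted and running[user_id] >= threshold:
            counted.add(user_id)
            counts[created_at] += 1
    return sorted(counts.items())


result = views_per_date(ROWS)
```

Under these assumptions `result` pairs each date with its count in the same order as the expected table.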

This is what I tried, but it picks the first date and leaves out the rows with a count of 0. Here is a fiddle with the data populated

SELECT COUNT(*) AS cnt,
       created_at
FROM (
    SELECT user_id,
           SUM(duration) AS total,
           created_at
    FROM watch_time
    GROUP BY user_id
) AS t
WHERE t.total >= 120
GROUP BY created_at;

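Is there a way to do this in SQL, or should it be done at the application level?

Thanks in advance!

Update:

Version: AWS RDS MySQL 5.7.33

But I can switch to Postgres if that helps.

I would also appreciate even just a way to get MIN(date) with all dates included (including those with 0 views), something better than this: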

SELECT IFNULL(cnt, 0) AS cnt,
       t3.created_at
FROM (
    SELECT COUNT(*) AS cnt,
           created_at
    FROM (
        SELECT user_id,
               SUM(duration) AS total,
               created_at
        FROM watch_time
        GROUP BY user_id
    ) AS t
    WHERE t.total >= 120
    GROUP BY created_at
) AS t2
RIGHT JOIN (
    SELECT DISTINCT created_at
    FROM watch_time
) AS t3
    ON t2.created_at = t3.created_at;

It returns:

| cnt   | created_at |
+-------+------------+
| 1     | 2021-10-15 |
| 0     | 2021-10-16 |
| 0     | 2021-10-17 |
| 2     | 2021-10-18 |
| 0     | 2021-10-19 |

But I am not sure whether the date (2021-10-15) is picked at random or whether it is always the lowest date.
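
On the question of the date: as far as I know, when ONLY_FULL_GROUP_BY is disabled MySQL may return the value from any row of the group for a column that is neither aggregated nor listed in GROUP BY, so the 2021-10-15 above should not be relied on as the lowest date. A hedged sketch of making that choice explicit with MIN(created_at) follows. It runs the query on an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example is self-contained), and it uses LEFT JOIN and COALESCE in place of RIGHT JOIN and IFNULL for portability. `FIRST_DATE_SQL` and `first_date_rows` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
conn.execute(
    "CREATE TABLE watch_time (id INTEGER, user_id INTEGER, duration INTEGER,"
    " created_at TEXT, film_id INTEGER)"
)
conn.executemany(
    "INSERT INTO watch_time VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 80, "2021-10-15", 1),
        (2, 1, 70, "2021-10-16", 1),
        (3, 1, 200, "2021-10-17", 2),
        (4, 2, 50, "2021-10-18", 1),
        (5, 2, 90, "2021-10-18", 1),
        (6, 3, 140, "2021-10-18", 2),
        (7, 4, 10, "2021-10-19", 3),
    ],
)

FIRST_DATE_SQL = """
SELECT d.created_at, COALESCE(v.cnt, 0) AS cnt
FROM (SELECT DISTINCT created_at FROM watch_time) AS d
LEFT JOIN (
    SELECT first_date, COUNT(*) AS cnt
    FROM (
        SELECT user_id,
               SUM(duration)   AS total,
               MIN(created_at) AS first_date  -- explicit: always the user's earliest date
        FROM watch_time
        GROUP BY user_id
    ) AS t
    WHERE t.total >= 120
    GROUP BY first_date
) AS v ON v.first_date = d.created_at
ORDER BY d.created_at
"""

first_date_rows = conn.execute(FIRST_DATE_SQL).fetchall()
```

This reproduces the table above deterministically, but it still attributes the view to the user's first date rather than to the date on which the total reaches 120.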

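Update 2

Is it also possible to take film_id into account? That is, treat each combination of user_id and film_id as a unique view, instead of grouping by user_id alone.

So in this case:

row1 and row2 both have user_id: 1 and film_id: 1, which gives 1 view because the sum of their durations is >= 120. So the date in this case will be 2021-10-16.

But row3 has user_id: 1 and film_id: 2, and its duration is >= 120, so it is also 1 view, with the date 2021-10-17.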

| id | user_id | duration | created_at | film_id |
+----+---------+----------+------------+---------+
| 1  | 1       | 80       | 2021-10-15 | 1       |
| 2  | 1       | 70       | 2021-10-16 | 1       |
| 3  | 1       | 200      | 2021-10-17 | 2       |
| 4  | 2       | 50       | 2021-10-18 | 1       |
| 5  | 2       | 90       | 2021-10-18 | 1       |
| 6  | 3       | 140      | 2021-10-18 | 2       |
| 7  | 4       | 10       | 2021-10-19 | 3       |

Expected result:

| cnt   | created_at |
+-------+------------+
| 0     | 2021-10-15 |
| 1     | 2021-10-16 |
| 1     | 2021-10-17 |
| 2     | 2021-10-18 |
| 0     | 2021-10-19 |
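
One possible way to get this per-(user_id, film_id) result without user variables or window functions is a pair of correlated subqueries: a row "crosses" the threshold when the running total up to and including it is >= 120 while the running total before it is still below 120. The sketch below is an illustration under assumptions, not a verified MySQL solution: it runs on an in-memory SQLite database as a stand-in, it assumes durations are positive (so each pair crosses at most once), and it uses id to break ties within a date. `CROSSING_SQL` and `film_view_rows` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
conn.execute(
    "CREATE TABLE watch_time (id INTEGER, user_id INTEGER, duration INTEGER,"
    " created_at TEXT, film_id INTEGER)"
)
conn.executemany(
    "INSERT INTO watch_time VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 80, "2021-10-15", 1),
        (2, 1, 70, "2021-10-16", 1),
        (3, 1, 200, "2021-10-17", 2),
        (4, 2, 50, "2021-10-18", 1),
        (5, 2, 90, "2021-10-18", 1),
        (6, 3, 140, "2021-10-18", 2),
        (7, 4, 10, "2021-10-19", 3),
    ],
)

CROSSING_SQL = """
SELECT d.created_at, COUNT(c.id) AS cnt
FROM (SELECT DISTINCT created_at FROM watch_time) AS d
LEFT JOIN (
    SELECT w.id, w.created_at
    FROM watch_time AS w
    -- running total up to and including this row has reached 120 ...
    WHERE (SELECT SUM(p.duration)
           FROM watch_time AS p
           WHERE p.user_id = w.user_id AND p.film_id = w.film_id
             AND (p.created_at < w.created_at
                  OR (p.created_at = w.created_at AND p.id <= w.id))) >= 120
      -- ... while the running total before this row had not
      AND (SELECT COALESCE(SUM(p.duration), 0)
           FROM watch_time AS p
           WHERE p.user_id = w.user_id AND p.film_id = w.film_id
             AND (p.created_at < w.created_at
                  OR (p.created_at = w.created_at AND p.id < w.id))) < 120
) AS c ON c.created_at = d.created_at
GROUP BY d.created_at
ORDER BY d.created_at
"""

film_view_rows = conn.execute(CROSSING_SQL).fetchall()
```

Dropping the two `p.film_id = w.film_id` conditions should give the user-only variant from the first part of the question. The correlated subqueries scan the table once per row, so this is likely to be slow on large tables without an index covering user_id, film_id and created_at.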

Tags: mysql, sql

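Solution

You can implement your counting logic with MySQL user variables. The query below essentially sorts the table's rows by user_id and created_at and accumulates the duration row by row.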

http://sqlfiddle.com/#!9/569088/14

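-- Outer query: per date, count the rows whose running total has reached 120.
SELECT created_at, SUM(CASE WHEN duration >= 120 THEN 1 ELSE 0 END) counts
FROM (
  SELECT user_id, created_at,
  -- Reset the running total for a new user; if the user already reached 120 on an earlier date, cancel this row.
  CASE WHEN @UID != user_id THEN @SUM_TIME := 0 WHEN @SUM_TIME >= 120 AND @DT != created_at THEN @SUM_TIME := 0 - duration ELSE 0 END SX,
  -- Running total, exposed under the alias duration (this is what the outer query tests).
  @SUM_TIME := @SUM_TIME + duration AS duration,
  @UID := user_id,
  @DT := created_at
  FROM watch_time
  JOIN ( SELECT @SUM_TIME := 0, @DT := NOW(), @UID := '' ) t  -- initialise the variables
  ORDER BY user_id, created_at
) f
GROUP BY created_at;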
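
To see what the row-by-row pass does, the following Python sketch emulates the variable updates of the query above. It is only an emulation under one assumption: that MySQL evaluates the SELECT expressions left to right for each row in the ORDER BY order, which the query relies on but which the MySQL manual describes as undefined for expressions involving user variables. `variable_pass` and `ROWS` are hypothetical names.

```python
# Sample data reduced to the columns the query uses: (user_id, created_at, duration)
ROWS = [
    (1, "2021-10-15", 80),
    (1, "2021-10-16", 70),
    (1, "2021-10-17", 200),
    (2, "2021-10-18", 50),
    (2, "2021-10-18", 90),
    (3, "2021-10-18", 140),
    (4, "2021-10-19", 10),
]


def variable_pass(rows, threshold=120):
    """Emulate the @SUM_TIME / @UID / @DT updates and the outer GROUP BY."""
    sum_time, last_user, last_date = 0, None, None  # mirrors @SUM_TIME, @UID, @DT
    per_date = {}
    # ORDER BY user_id, created_at (Python's sort is stable, so ties keep input order)
    for user_id, created_at, duration in sorted(rows, key=lambda r: (r[0], r[1])):
        if user_id != last_user:
            sum_time = 0              # new user: restart the running total
        elif sum_time >= threshold and created_at != last_date:
            sum_time = -duration      # already counted on an earlier date: cancel this row
        sum_time += duration          # the value the outer query sees as "duration"
        last_user, last_date = user_id, created_at
        hit = 1 if sum_time >= threshold else 0
        per_date[created_at] = per_date.get(created_at, 0) + hit
    return sorted(per_date.items())


emulated = variable_pass(ROWS)
```

On the sample data the emulation matches the expected result of the original question. It also suggests a limitation worth checking against real data: after a row is cancelled the running total is back at 0, so a later row of 120 or more for the same user would be counted as a second view.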
