首页 > 解决方案 > 在 bigquery 中查询负账户余额期间

问题描述

我正在玩 bigquery 并遇到了一个有趣的用例。我有一组客户和帐户余额。账户余额集合记录任何账户余额变化。

顾客:

+---------+--------+
|    ID   |  Name  |
+---------+--------+
| 1       | Alice  |
| 2       | Bob    |
+---------+--------+

账户余额:

+---------+---------------+---------+------------+
|    ID   |  customer_id  |  value  | timestamp  |
+---------+---------------+---------+------------+
| 1       | 1             |  -500   | 2019-02-12 |
| 2       | 1             |  -200   | 2019-02-10 |
| 3       | 2             |  200    | 2019-02-10 |
| 4       | 1             |  0      | 2019-02-09 |
+---------+---------------+---------+------------+

目标是找出客户账户余额为负数的时间。生成的集合如下所示:

+---------+--------+---------------------------------+
|    ID   |  Name  |  Negative account balance since |
+---------+--------+---------------------------------+
| 1       | Alice  |  2 days                         |
+---------+--------+---------------------------------+

Bob 不在集合中,因为他的最后一个帐户记录显示为正值。

我认为涉及以下步骤:

这样的事情在sql中甚至可能吗?您对谁创建此类查询有任何想法吗?要获取当前帐户余额为负数的客户,我使用以下查询:

SELECT customer_id FROM (
  SELECT t.account_balance, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY timestamp DESC) as seqnum FROM `account_balances` t
) t
WHERE seqnum = 1 AND account_balance<0

标签: google-bigquery

解决方案


以下是 BigQuery 标准 SQL

#standardSQL
SELECT customer_id, name,
  SUM(IF(negative_positive < 0, days, 0)) negative_days,
  SUM(IF(negative_positive = 0, days, 0)) zero_days,
  SUM(IF(negative_positive > 0, days, 0)) positive_days
FROM (
  SELECT customer_id, negative_positive, grp,
    1 + DATE_DIFF(MAX(ts), MIN(ts), DAY) days
  FROM (
    SELECT customer_id, ts, SIGN(value) negative_positive, 
    COUNTIF(flag) OVER(PARTITION BY customer_id ORDER BY ts) grp
    FROM (
      SELECT *, SIGN(value) = IFNULL(LEAD(SIGN(value)) OVER(PARTITION BY customer_id ORDER BY ts), 0) flag
      FROM `project.dataset.balances`
    )
  )
  GROUP BY customer_id, negative_positive, grp
)
LEFT JOIN `project.dataset.customers`
ON id = customer_id
GROUP BY customer_id, name

您可以使用您问题中的示例数据进行测试,使用上面的示例数据,如下例所示

#standardSQL
WITH `project.dataset.balances` AS (
  SELECT 1 customer_id, -500 value, DATE '2019-02-12' ts UNION ALL
  SELECT 1, -200, '2019-02-10' UNION ALL
  SELECT 2, 200, '2019-02-10' UNION ALL
  SELECT 1, 0, '2019-02-09' 
), `project.dataset.customers` AS (
  SELECT 1 id, 'Alice' name UNION ALL
  SELECT 2, 'Bob' 
)
SELECT customer_id, name,
  SUM(IF(negative_positive < 0, days, 0)) negative_days,
  SUM(IF(negative_positive = 0, days, 0)) zero_days,
  SUM(IF(negative_positive > 0, days, 0)) positive_days
FROM (
  SELECT customer_id, negative_positive, grp,
    1 + DATE_DIFF(MAX(ts), MIN(ts), DAY) days
  FROM (
    SELECT customer_id, ts, SIGN(value) negative_positive, 
    COUNTIF(flag) OVER(PARTITION BY customer_id ORDER BY ts) grp
    FROM (
      SELECT *, SIGN(value) = IFNULL(LEAD(SIGN(value)) OVER(PARTITION BY customer_id ORDER BY ts), 0) flag
      FROM `project.dataset.balances`
    )
  )
  GROUP BY customer_id, negative_positive, grp
)
LEFT JOIN `project.dataset.customers`
ON id = customer_id
GROUP BY customer_id, name
-- ORDER BY customer_id

结果

Row customer_id name    negative_days   zero_days   positive_days    
1   1           Alice   3               1           0    
2   2           Bob     0               0           1    

推荐阅读