Grouping by user and month in Google BigQuery

Question

I'm new to BigQuery and I wrote the following query:

select
  email,
  min(address.zip) as zip,
  min(country) as country,
  min(source) as source,
  min(created_at) as first_order,
  max(created_at) as last_order,
  count(order_number) as number_orders,
  sum(total_price) as total_spent
from orders_data
group by email

It gives me a table in which all orders are grouped by customer, with output like this:

email | zip | country | source | first_order | last_order | number_orders | total_spent |

What I'd like to add is a second grouping by month: for each customer (email), I want total_spent and number_orders per month.

Depending on which is easier, the result could look like this:

email | zip | country | source | first_order | last_order | number_orders | total_spent | 2017_January_spent | 2017_January_num_orders | 2017_February_spent | 2017_February_num_orders | ... | 

Or grouped by row, like this:

 month  | email | zip | country | source | first_order | last_order | number_orders | total_spent | 2017_January_spent | 2017_January_num_orders | 2017_February_spent | 2017_February_num_orders | ... | 
 01.2017| cust_A|
 02.2017| cust_A|
 03.2017| cust_A|

Not sure whether it helps, but my data ranges from 2017 to 2020, and my timestamp column "created_at" looks like this: 2020-01-02 16:20:12 UTC

I've already tried adding group by month(created_at), and I also played around with timestamp_trunc, but neither worked.

Any help would be much appreciated!

Thanks a lot!

Tags: sql, date, group-by, google-bigquery

Solution


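Use the truncation function that matches your column's type: timestamp_trunc() for TIMESTAMP (which is what created_at appears to be), date_trunc() for DATE, or datetime_trunc() for DATETIME. Truncating to the month and adding that value to the group by gives one row per customer per month: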

select timestamp_trunc(created_at, month) as mon, email,
       min(address.zip) as zip, min(country) as country, min(source) as source,
       min(created_at) as first_order, max(created_at) as last_order,
       count(order_number) as number_orders, sum(total_price) as total_spent
from orders_data
group by mon, email
order by mon, email;
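
The query above returns the row-per-month layout. If the wide layout from the question is preferred (one row per customer, one pair of columns per month), conditional aggregation is one possible approach. The following is only a sketch under assumptions: the column names spent_2017_january, num_orders_2017_january and so on are made up for illustration (they start with a letter rather than a digit to stay on the safe side of BigQuery's column naming rules), and only two months are written out; each further month needs its own pair of expressions.

```sql
-- Sketch: one row per customer, with per-month columns built by conditional aggregation.
-- Column names such as spent_2017_january are illustrative, not from the original question.
select
  email,
  count(order_number) as number_orders,
  sum(total_price) as total_spent,
  -- January 2017: sum the price only for rows whose month matches, otherwise add 0
  sum(if(timestamp_trunc(created_at, month) = timestamp '2017-01-01', total_price, 0)) as spent_2017_january,
  -- count() skips NULLs, so only matching rows with an order_number are counted
  count(if(timestamp_trunc(created_at, month) = timestamp '2017-01-01', order_number, null)) as num_orders_2017_january,
  -- February 2017: same pattern with a different month boundary
  sum(if(timestamp_trunc(created_at, month) = timestamp '2017-02-01', total_price, 0)) as spent_2017_february,
  count(if(timestamp_trunc(created_at, month) = timestamp '2017-02-01', order_number, null)) as num_orders_2017_february
from orders_data
group by email
order by email;
```

With data spanning 2017 to 2020 this would mean close to a hundred hand-written columns, so the row-per-month result is usually easier to maintain. Note also that timestamp_trunc() truncates in UTC unless a time zone is passed as an optional third argument, which matters if month boundaries should follow a local time zone.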
