首页 > 解决方案 > BigQuery 分区表:指定月份的最短方法是什么?

问题描述

想知道查询分区表时指定月份的最短方法是什么。

TIMESTAMP_TRUNCto the month 看起来很诱人,但似乎不能用作分区过滤器:

SELECT SUM(views) views
FROM `fh-bigquery.wikipedia_v2.pageviews_2018`
WHERE TIMESTAMP_TRUNC(datehour, month) = '2018-04-01'

Cannot query over table 'fh-bigquery.wikipedia_v2.pageviews_2018' without a filter that can be used for partition elimination

BETWEEN日期看起来也很诱人,但需要知道每个月有多少天:

SELECT SUM(views) views
FROM `fh-bigquery.wikipedia_v2.pageviews_2018`
WHERE DATE(datehour) BETWEEN '2018-04-01' AND '2018-04-31'

Could not cast literal "2018-04-31" to type DATE at [3:47]

DATE_SUB(DATE_ADD(month), day有效,但需要输入两次日期并且输入时间很长:

SELECT SUM(views) views
FROM `fh-bigquery.wikipedia_v2.pageviews_2018`
WHERE DATE(datehour) 
  BETWEEN '2018-04-01' 
  AND DATE_SUB(DATE_ADD('2018-04-01', INTERVAL 1 MONTH), INTERVAL 1 DAY) 

15746003449

您将如何改进这一点?

标签: sqlgoogle-bigqueryuser-defined-functionsbigquery-standard-sql

解决方案


我会这样做:

SELECT SUM(views) as views
FROM `fh-bigquery.wikipedia_v2.pageviews_2018`
WHERE hour >= date '2018-04-01' AND hour < date '2018-05-01';

您可以将日期常量放在 CTE 中:

with params as (
      select date '2018-04-01' as dte
     )
select sum(views) as views
from params cross join
     `fh-bigquery.wikipedia_v2.pageviews_2018`
where hour >= params.dte AND hour < date_add(params.dte, interval 1 month)

推荐阅读