首页 > 解决方案 > 日期之间的左连接匹配

问题描述

我正在尝试将国定假日的日期与时间段边界配对。我想离开加入反对buckets。这样我就可以确定假期是否在某个时间范围内has_holiday

使用以下查询,我收到一个查询错误:

Query error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join. at [6:1]

解决这个问题的最佳方法是什么?

DECLARE now_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP();
DECLARE end_date TIMESTAMP DEFAULT TIMESTAMP_SUB(now_date, INTERVAL 365 DAY);
DECLARE block ARRAY<TIMESTAMP> DEFAULT GENERATE_TIMESTAMP_ARRAY(end_date, now_date, INTERVAL 7 DAY);
---------------------------------------------------------------------

WITH buckets AS(
SELECT
  TIMESTAMP_SUB(bucket_end, INTERVAL 7 DAY) AS bucket_start,
  bucket_end
FROM
  UNNEST(block) AS bucket_end
)

SELECT
  bucket_start,
  bucket_end,
  IF(holiday_name IS NOT NULL, 1, 0) AS has_holiday
FROM
  buckets AS b 
LEFT JOIN
  `external_landing.tbl_public_holidays_2020` AS ex
ON
  ex.date BETWEEN bucket_start AND bucket_end

表结构external_landing.tbl_public_holidays_2020

|    | day       | date                | holiday_name                    | type                 |
|---:|:----------|:--------------------|:--------------------------------|:---------------------|
|  0 | Tuesday   | 2019-01-01 00:00:00 | New Year's Day                  | National Holiday     |
|  1 | Wednesday | 2019-01-02 00:00:00 | Day after New Year's Day        | National Holiday     |
|  2 | Monday    | 2019-01-21 00:00:00 | Wellington Anniversary Day      | Regional Holiday     |
|  3 | Monday    | 2019-01-28 00:00:00 | Auckland Anniversary Day        | Regional Holiday     |
|  4 | Monday    | 2019-02-04 00:00:00 | Nelson Anniversary Day          | Regional Holiday     |
|  5 | Wednesday | 2019-02-06 00:00:00 | Waitangi Day                    | National Holiday     |
|  6 | Monday    | 2019-03-11 00:00:00 | Taranaki Anniversary Day        | Regional Holiday     |
|  7 | Monday    | 2019-03-25 00:00:00 | Otago Anniversary Day           | Regional Holiday     |
|  8 | Friday    | 2019-04-19 00:00:00 | Good Friday                     | National Holiday     |
|  9 | Monday    | 2019-04-22 00:00:00 | Easter Monday                   | National Holiday     |
| 10 | Tuesday   | 2019-04-23 00:00:00 | Southland Anniversary Day       | Regional Holiday     |
| 11 | Thursday  | 2019-04-25 00:00:00 | Anzac Day                       | National Holiday     |
| 12 | Sunday    | 2019-05-12 00:00:00 | Mother's Day                    | Not A Public Holiday |
| 13 | Monday    | 2019-06-03 00:00:00 | Queen's Birthday                | National Holiday     |
| 14 | Sunday    | 2019-09-01 00:00:00 | Father's Day                    | Not A Public Holiday |
| 15 | Monday    | 2019-09-23 00:00:00 | South Canterbury Day            | Regional Holiday     |
| 16 | Friday    | 2019-10-25 00:00:00 | Hawke's Bay Anniversary Day     | Regional Holiday     |
| 17 | Monday    | 2019-10-28 00:00:00 | Labour Day                      | National Holiday     |
| 18 | Monday    | 2019-11-04 00:00:00 | Marlborough Anniversary Day     | Regional Holiday     |
| 19 | Friday    | 2019-11-15 00:00:00 | Christchurch Show Day           | Regional Holiday     |
| 20 | Monday    | 2019-12-02 00:00:00 | Chatham Islands Anniversary Day | Regional Holiday     |
| 21 | Monday    | 2019-12-02 00:00:00 | Westland Anniversary Day        | Regional Holiday     |
| 22 | Wednesday | 2019-12-25 00:00:00 | Christmas Day                   | National Holiday     |
| 23 | Thursday  | 2019-12-26 00:00:00 | Boxing Day                      | National Holiday     |

标签: sqlgoogle-bigquery

解决方案


在下面尝试(BigQuery 标准 SQL)

DECLARE now_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP();
DECLARE end_date TIMESTAMP DEFAULT TIMESTAMP_SUB(now_date, INTERVAL 365 DAY);
DECLARE block ARRAY<TIMESTAMP> DEFAULT GENERATE_TIMESTAMP_ARRAY(end_date, now_date, INTERVAL 7 DAY);
---------------------------------------------------------------------

WITH buckets AS(
  SELECT TIMESTAMP_SUB(bucket_end, INTERVAL 7 DAY) AS bucket_start, bucket_end
  FROM UNNEST(block) AS bucket_end
)
SELECT
  bucket_start,
  bucket_end,
  IF(holiday_name IS NOT NULL, 1, 0) AS has_holiday
FROM buckets AS b 
CROSS JOIN `external_landing.tbl_public_holidays_2020` AS ex
WHERE ex.date BETWEEN bucket_start AND bucket_end     

同时,我建议使用以下版本

DECLARE now_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP();
DECLARE end_date TIMESTAMP DEFAULT TIMESTAMP_SUB(now_date, INTERVAL 365 DAY);
DECLARE block ARRAY<TIMESTAMP> DEFAULT GENERATE_TIMESTAMP_ARRAY(end_date, now_date, INTERVAL 7 DAY);
---------------------------------------------------------------------

WITH buckets AS(
  SELECT TIMESTAMP_SUB(bucket_end, INTERVAL 7 DAY) AS bucket_start, bucket_end
  FROM UNNEST(block) AS bucket_end
)
SELECT
  bucket_start,
  bucket_end,
  COUNTIF(holiday_name IS NOT NULL AND ex.date BETWEEN bucket_start AND bucket_end) AS holidays
FROM buckets AS b 
CROSS JOIN `external_landing.tbl_public_holidays_2020` AS ex
GROUP BY bucket_start, bucket_end   

这将输出所有包含假期数的存储桶
如果您想排除没有假期的存储桶 - 只需在下面添加到末尾

HAVING holidays > 0   

推荐阅读