首页 > 解决方案 > 在 bigquery 中加入具有相同架构的两个表

问题描述

我有两个与输入具有相同架构的表:

+---------+--------+----------------------+
|  value  |  city  |   timestamp          |
+---------+--------+----------------------+
| 50      |  LA    |  2019-02-6 03:05pm   |
| 163     |  NYC   |  2019-02-5 03:06pm   |
| 681     |  SF    |  2019-02-4 06:41pm   |
| 35      |  LA    |  2019-02-3 05:12pm   |
+---------+--------+----------------------+

第一个表包含定期和第二个费用支付。我想加入表格并将它们分组如下:

+------------+--------+----------+--------------+
|  regular   |  fees  |   city   |  timestamp   |
+------------+--------+----------+--------------+
| 50         | 20     | LA       |  2019-02-6   |
| 163        | NULL   | NYC      |  2019-02-5   |
| 681        | ..     | SF       |  2019-02-4   |
| 35         | ..     | LA       |  2019-02-3   |
+------------+--------+----------+--------------+

可能有几天不收取任何费用。我试过的:

SELECT t1.city, regular, fees, t1.day
FROM
(
  SELECT city, SUM(value) AS regular, FORMAT_TIMESTAMP("%Y-%m-%d", TIMESTAMP(timestamp)) as day
  FROM `payments`
  GROUP BY day, city
) t1
FULL JOIN (
  SELECT city, SUM(value) AS fees, FORMAT_TIMESTAMP("%Y-%m-%d", TIMESTAMP(timestamp)) as day
  FROM `fees`
  GROUP BY day, city
) t2
ON t1.day = t2.day
ORDER BY t1.day DESC

这会产生正确的输出模式,但是它不会正确地计算费用:

+------------+--------+----------+--------------+
|  regular   |  fees  |   city   |  timestamp   |
+------------+--------+----------+--------------+
| 26500      | 6300   | LA       |  2019-02-6   |
| 26500      | 8500   | LA       |  2019-02-6   |
| 26500      | 1000   | LA       |  2019-02-6   |
+------------+--------+----------+--------------+

如您所见,我在同一天和城市获得不同的费用值。有什么想法我在这里做错了吗?

标签: google-bigquery

解决方案


问题只是在你的 ON 子句中 - 你只在几天内加入,但应该加入几天和城市,如下面的片段

ON t1.day = t2.day
AND t1.city = t2.city 

推荐阅读