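google-bigquery - BigQuery deduplicate rows - no unique column
Problem description
I have a BigQuery table that is the result of several LEFT JOINs. Because the joins fan out (a join key in one table matches more than one row in another, much like a Cartesian product), the result contains duplicate rows.
How can I deduplicate the rows so that each record appears only once?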
SELECT T1.Col1,T1.Col2,........
T2.Col1,T2.Col2,........
T3.Col1,T3.Col3,........
T5.Col1,T5.Col2,........
T7.Col1.......
FROM `TABLE1` as T1
LEFT JOIN
`TABLE` as T2 ON T1.CUSTOMER_CODE = T2.CUSTOMER_CODE
LEFT JOIN
`TABLE3` as T3 ON (T1.MIAL_CODE) = T3.MIAL_CODE
LEFT JOIN
`TABLE5` as T5
ON T1.WORK_CODE = T5.WORK_CODE
LEFT JOIN
`TABLE7` as T7
ON T1.CA_DATE = T7.date
ORDER BY CA_DATE
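Solution
I used GROUP BY on every selected column, which collapses the rows that are identical in all of those columns and so removes the duplicates.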
SELECT T1.Col1,T1.Col2,........
T2.Col1,T2.Col2,........
T3.Col1,T3.Col3,........
T5.Col1,T5.Col2,........
T7.Col1.......
FROM `TABLE1` as T1
LEFT JOIN
`TABLE` as T2 ON T1.CUSTOMER_CODE = T2.CUSTOMER_CODE
LEFT JOIN
`TABLE3` as T3 ON (T1.MIAL_CODE) = T3.MIAL_CODE
LEFT JOIN
`TABLE5` as T5
ON T1.WORK_CODE = T5.WORK_CODE
LEFT JOIN
`TABLE7` as T7
ON T1.CA_DATE = T7.date
GROUP BY T1.Col1,T1.Col2,........
T2.Col1,T2.Col2,........
T3.Col1,T3.Col3,........
T5.Col1,T5.Col2,........
T7.Col1.......
ORDER BY CA_DATE
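The effect of grouping by every selected column can be checked on a toy example. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for BigQuery, and the tables `customers` and `regions` and their columns are hypothetical, not the tables from the question. It also shows SELECT DISTINCT, which on this data gives the same result as the GROUP BY without repeating the column list.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, not the ones from the question.
    CREATE TABLE customers (customer_code TEXT, name TEXT);
    CREATE TABLE regions (customer_code TEXT, region TEXT);
    INSERT INTO customers VALUES ('C1', 'Alice'), ('C2', 'Bob');
    -- 'C1' occurs twice on the right side, so the LEFT JOIN fans out.
    INSERT INTO regions VALUES ('C1', 'EU'), ('C1', 'EU'), ('C2', 'US');
""")

COLS = "t1.customer_code, t1.name, t2.region"
JOIN = """
    FROM customers AS t1
    LEFT JOIN regions AS t2 ON t1.customer_code = t2.customer_code
"""
ORDER = "ORDER BY t1.customer_code"

# Plain join: the duplicated right-hand row yields a duplicated result row.
raw = conn.execute(f"SELECT {COLS} {JOIN} {ORDER}").fetchall()

# GROUP BY every selected column: identical rows collapse into one.
grouped = conn.execute(
    f"SELECT {COLS} {JOIN} GROUP BY {COLS} {ORDER}"
).fetchall()

# SELECT DISTINCT: same result here, without repeating the column list.
distinct = conn.execute(f"SELECT DISTINCT {COLS} {JOIN} {ORDER}").fetchall()

print(len(raw), len(grouped), len(distinct))
```

Both forms only remove rows that are identical in every selected column. If two joined rows differ in even one column, they both remain, and the duplication has to be handled at its source, for example by deduplicating the joined table before the join.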