BigQuery: deduplicating rows when no column is unique

Problem description

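I have a BigQuery table that is the result of several left joins. Because some of the joined tables match more than one row per key, the joins fan out (a Cartesian-product effect) and the result contains duplicate rows.

How can I deduplicate the rows so that each record appears only once?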



SELECT T1.Col1, T1.Col2, ........
       T2.Col1, T2.Col2, ........
       T3.Col1, T3.Col3, ........
       T5.Col1, T5.Col2, ........
       T7.Col1.......
FROM `TABLE1` AS T1
LEFT JOIN `TABLE` AS T2 ON T1.CUSTOMER_CODE = T2.CUSTOMER_CODE
LEFT JOIN `TABLE3` AS T3 ON T1.MIAL_CODE = T3.MIAL_CODE
LEFT JOIN `TABLE5` AS T5 ON T1.WORK_CODE = T5.WORK_CODE
LEFT JOIN `TABLE7` AS T7 ON T1.CA_DATE = T7.date
ORDER BY CA_DATE

Tags: google-bigquery

Solution


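I used GROUP BY on every selected column, which collapses rows that are identical across all of those columns and so removes the duplicates.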

SELECT T1.Col1, T1.Col2, ........
       T2.Col1, T2.Col2, ........
       T3.Col1, T3.Col3, ........
       T5.Col1, T5.Col2, ........
       T7.Col1.......
FROM `TABLE1` AS T1
LEFT JOIN `TABLE` AS T2 ON T1.CUSTOMER_CODE = T2.CUSTOMER_CODE
LEFT JOIN `TABLE3` AS T3 ON T1.MIAL_CODE = T3.MIAL_CODE
LEFT JOIN `TABLE5` AS T5 ON T1.WORK_CODE = T5.WORK_CODE
LEFT JOIN `TABLE7` AS T7 ON T1.CA_DATE = T7.date
GROUP BY T1.Col1, T1.Col2, ........
         T2.Col1, T2.Col2, ........
         T3.Col1, T3.Col3, ........
         T5.Col1, T5.Col2, ........
         T7.Col1.......
ORDER BY CA_DATE
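
Grouping by every selected column, with no aggregate functions, returns the same rows as SELECT DISTINCT, which may be shorter to write. The sketch below is only an illustration: the `customers` and `regions` CTEs and their columns are made-up sample data, not the tables from the question. It reproduces the fan-out with a small inline dataset and removes it with DISTINCT.

```sql
-- Hypothetical sample data; names are illustrative, not from the question.
WITH customers AS (
  SELECT 'C1' AS customer_code, 'Alice' AS name
),
regions AS (
  -- The same row appears twice, so joining on customer_code fans out.
  SELECT 'C1' AS customer_code, 'EU' AS region UNION ALL
  SELECT 'C1', 'EU'
)
-- Without DISTINCT the join yields two identical rows;
-- with DISTINCT only one of them is kept.
SELECT DISTINCT
  c.customer_code,
  c.name,
  r.region
FROM customers AS c
LEFT JOIN regions AS r
  ON c.customer_code = r.customer_code
ORDER BY customer_code;
```

Both approaches only collapse rows that match in every selected column. If the duplicates differ in some column (for example, two different `region` values for the same customer), they will remain, and the joined table would need to be reduced to one row per key before the join.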

