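sql - How can I prevent a left join from returning more records than the unjoined query?

Problem description

I am trying to join several tables with Left Join so that, based on the id field of the main table (table_a), I only get the records that are found: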

Select table_a.id, table_b.location, table_c.material
From table_a
Left Join table_b
On table_a.id = table_b.id
Left Join table_c
On table_a.id = table_c.id

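Everything seems fine here: I get the expected fields in the output, and the record count is 11,000 (the same as table_a).

However, when I add the next left join to the query, this time on table_b's id field instead of table_a's id field, I get 11,500 records: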

Select table_a.id, table_b.location, table_c.material, table_d.sales
From table_a
Left Join table_b
On table_a.id = table_b.id
Left Join table_c
On table_a.id = table_c.id
Left Join table_d
On table_b.id = table_d.id

Do you know how I can prevent this?

Tags: sql, subquery

Solution

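There is a reason why table_d has more than one match for table_b: at least one id occurs in several table_d rows, and a left join returns one output row per match. Here it is important to consider the business rules of the problem. Usually we cannot simply ignore the multiple results. Either we need to group, sum, or average the extra columns, or we need to pick one of the multiple matches according to some rule. For example, here I assume that I want the most recent matching record from table_d, as given by the month column. I use RANK() with PARTITION BY to order the "duplicate" rows for each id, and in this case I only want the first match (order_c = 1). Note that RANK() gives tied rows the same rank, so two table_d rows sharing the latest month would both be kept; ROW_NUMBER() avoids that if exactly one row per id is required. The query: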

WITH cte AS (
  Select table_a.id, table_b.location, table_c.material, table_d.sales,
         -- rank each id's table_d matches, most recent month first
         RANK() OVER (partition by table_a.id order by table_d.month desc) as order_c
  From table_a
  Left Join table_b
    On table_a.id = table_b.id
  Left Join table_c
    On table_a.id = table_c.id
  Left Join table_d
    On table_b.id = table_d.id
)
-- keep only the most recent match for each id
select id, location, material, sales, order_c
from cte
where order_c = 1
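
If the business rule is to combine the matching rows rather than pick one of them, another option is to collapse table_d to one row per id before joining. The following is only a sketch under assumptions: that summing sales is the right rule, and that table_b and table_c hold at most one row per id (which the 11,000-row result of the first query suggests). The alias total_sales is a hypothetical name, not something from the original question.

```sql
-- Sketch: aggregate table_d first so the join cannot multiply rows.
-- SUM(sales) is an assumed business rule; AVG or MAX may fit better.
SELECT a.id, b.location, c.material, d.total_sales
FROM table_a AS a
LEFT JOIN table_b AS b
  ON a.id = b.id
LEFT JOIN table_c AS c
  ON a.id = c.id
LEFT JOIN (
    -- one row per id, whatever the number of rows in table_d
    SELECT id, SUM(sales) AS total_sales
    FROM table_d
    GROUP BY id
) AS d
  ON b.id = d.id;
```

Because the derived table d has exactly one row per id, the result keeps the same row count as table_a as long as the other joined tables are also unique on id.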

You can see it in action in a fiddle, with the following schema and sample data:

create table table_a (ID INT);
create table table_b (ID INT, location varchar(10));
create table table_c (ID INT, material varchar(10));
create table table_d (ID INT, sales INT, month INT);
INSERT into table_a(ID) 
VALUES (1), (2), (3), (4), (5);
INSERT into table_b(ID, location) 
VALUES (1, 'UK'),
      (9, 'USA');
INSERT into table_c(ID, material) 
VALUES (1, 'paper');
INSERT into table_d(ID, sales, month) 
VALUES (1, 345, 1), (1, 599, 2);
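
To confirm which ids cause the extra rows, it can help to count the rows per id in table_d before changing the join. This is a minimal diagnostic sketch that uses only the tables defined above; match_count is a hypothetical alias.

```sql
-- List every id that appears more than once in table_d.
-- Each extra row here adds one extra row to the left join result.
SELECT id, COUNT(*) AS match_count
FROM table_d
GROUP BY id
HAVING COUNT(*) > 1;
```

With the sample data above this returns a single row, id 1 with match_count 2, which is why the plain four-table join produces 6 rows from the 5 rows in table_a.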
