MySQL: multiple LEFT JOINs grouped by the main table

Problem description

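I have the following scenario. An area has many territories, a territory has many addresses, and an address is visited several times within a month. I want to generate a monthly report per area (how many times each area was visited). I have written a query, but the result set contains fewer areas than expected because some addresses have not been visited. I have the following structure: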

areas: id | name (180 rows) // name is unique

territories: id | name | area_id (1k rows)

addresses: id | name | territory_id (80k rows)

visiting_addresses: id | address_id | date | status (1M+ rows) // status: 1 = visited, 2 = pending

My query is as follows.

select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total

from      areas ar
left join territories t on t.area_id=ar.id
left join addresses a on a.territory_id=t.id
left join visiting_addresses va on va.address_id=a.id
where     month(va.date) = '01'
and       year(va.date)='2020'
group by  ar.id

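The areas table contains 180 areas, but the result set shows only 144. Where is my mistake, and what explains it? The missing areas are the ones that had no visits.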

Tags: mysql, sql, left-join

Solution


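Your WHERE clause is turning the LEFT JOIN to visiting_addresses into an INNER JOIN: for an area without visits, va.date is NULL, so the condition is never true and the row is discarded. Because visiting_addresses is the right-most table in the LEFT JOIN chain, all the joins effectively become INNER JOINs. To prevent this, move the corresponding conditions from the WHERE clause to the ON clause: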

select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total

from      areas ar
left join territories t on t.area_id=ar.id
left join addresses a on a.territory_id=t.id
left join visiting_addresses va
  on  va.address_id=a.id
  and month(va.date) = '01'
  and year(va.date)='2020'

group by  ar.id
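
The difference between filtering in WHERE and filtering in ON can be reproduced on a tiny data set. The following is only an illustrative sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are made up, and the date filter is written as a range because SQLite has no month()/year() functions.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE territories (id INTEGER PRIMARY KEY, name TEXT, area_id INTEGER);
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, name TEXT, territory_id INTEGER);
    CREATE TABLE visiting_addresses (
        id INTEGER PRIMARY KEY, address_id INTEGER, date TEXT, status INTEGER);

    -- Made-up sample data: two areas, one territory and one address each.
    INSERT INTO areas VALUES (1, 'North'), (2, 'South');
    INSERT INTO territories VALUES (1, 'T1', 1), (2, 'T2', 2);
    INSERT INTO addresses VALUES (1, 'A1', 1), (2, 'A2', 2);
    -- Only the address in area 1 was visited in January 2020.
    INSERT INTO visiting_addresses VALUES
        (1, 1, '2020-01-10', 1),
        (2, 1, '2020-01-20', 2),
        (3, 2, '2019-12-31', 1);
""")

JOINS = """
    SELECT ar.id, ar.name,
           SUM(CASE WHEN va.status = 1 THEN 1 ELSE 0 END) AS visited,
           SUM(CASE WHEN va.status = 2 THEN 1 ELSE 0 END) AS pending,
           COUNT(va.id) AS total
    FROM areas ar
    LEFT JOIN territories t ON t.area_id = ar.id
    LEFT JOIN addresses a ON a.territory_id = t.id
    LEFT JOIN visiting_addresses va ON va.address_id = a.id
"""
DATE_FILTER = "va.date >= '2020-01-01' AND va.date < '2020-02-01'"
TAIL = "GROUP BY ar.id ORDER BY ar.id"

# Filter in WHERE: rows where va.date is NULL or out of range are dropped.
where_rows = conn.execute(f"{JOINS} WHERE {DATE_FILTER} {TAIL}").fetchall()

# Filter in ON: unmatched areas survive with NULLs on the va side.
on_rows = conn.execute(f"{JOINS} AND {DATE_FILTER} {TAIL}").fetchall()

print(where_rows)
print(on_rows)
```

With this sample data the WHERE variant returns only the area that has January visits, while the ON variant also returns the other area with zero counts.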

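However, since you have a lot of rows, I would rather run two queries. First, use INNER JOINs to fetch only the areas that have visits in the given month. You should also rewrite the condition on va.date as a range so that an index can be used: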

select ar.id as area_id, ar.name as area,
sum(case when va.status = 1 then 1 else 0 end) as visited,
sum(case when va.status = 2 then 1 else 0 end) as pending,
count(va.id) as total

from areas ar
join territories t on t.area_id=ar.id
join addresses a on a.territory_id=t.id
join visiting_addresses va on  va.address_id=a.id
where va.date >= '2020-01-01'
  and va.date <  '2020-02-01'

group by  ar.id

Make sure you have an index on visiting_addresses(date), or even better on visiting_addresses(date, address_id, status).
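
The half-open range (first day of the month inclusive, first day of the next month exclusive) can be computed in application code for any month. The helper below is a hypothetical sketch, not part of the original answer; the name month_range is an assumption.

```python
from datetime import date
from typing import Tuple


def month_range(year: int, month: int) -> Tuple[str, str]:
    """Return (start, end) ISO dates so that start <= d < end covers the month."""
    start = date(year, month, 1)
    # Roll over to January of the next year when the month is December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_range(2020, 1)
print(start, end)
```

The two values can then be bound as parameters to `va.date >= ? and va.date < ?`, which keeps the column free of function calls so the index on date remains usable.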

Then fetch all areas with a simple query:

select ar.id as area_id, ar.name as area 
from areas ar

and, in application code, add the missing areas to the first result, setting visited, pending and total to zero for them.
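
The merge step in application code could look roughly like this. It is a minimal sketch that assumes both result sets have already been fetched as lists of tuples in the column order of the two queries; the function name fill_missing_areas and the sample rows are hypothetical.

```python
def fill_missing_areas(all_areas, visited_rows):
    """Combine the two result sets into one report.

    all_areas:    rows of (area_id, area) from the "all areas" query
    visited_rows: rows of (area_id, area, visited, pending, total)
                  from the INNER JOIN query
    """
    # Index the aggregated counts by area_id for constant-time lookup.
    stats = {row[0]: tuple(row[2:]) for row in visited_rows}
    # Areas without an entry get zeros for visited, pending and total.
    return [
        (area_id, name, *stats.get(area_id, (0, 0, 0)))
        for area_id, name in all_areas
    ]


# Made-up sample data for illustration.
all_areas = [(1, "North"), (2, "South"), (3, "East")]
visited_rows = [(1, "North", 5, 2, 7), (3, "East", 1, 0, 1)]
report = fill_missing_areas(all_areas, visited_rows)
print(report)
```

The report keeps the order of the "all areas" query, so any ORDER BY applied there carries over to the final output.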

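The INNER JOIN version should be much faster, because the engine can now start from visiting_addresses and read only the necessary rows, using the index for the WHERE condition.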

You can also use a single, more complex query. The idea is to LEFT JOIN a pre-aggregated subquery:

select ar.id as area_id, ar.name as area,
    coalesce(visited, 0) as visited,
    coalesce(pending, 0) as pending,
    coalesce(total, 0) as total
from areas ar
left join (
    select t.area_id,
    sum(case when va.status = 1 then 1 else 0 end) as visited,
    sum(case when va.status = 2 then 1 else 0 end) as pending,
    count(va.id) as total
    from territories t
    join addresses a on a.territory_id=t.id
    join visiting_addresses va on  va.address_id=a.id
    where va.date >= '2020-01-01'
      and va.date <  '2020-02-01'
    group by t.area_id
) x on x.area_id = ar.id
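
The pre-aggregation approach can be tried out on a small data set. As before, this is only a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL and the sample rows are invented. It includes one area with no visits in the month and one area with no territories at all.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE areas (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE territories (id INTEGER PRIMARY KEY, name TEXT, area_id INTEGER);
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, name TEXT, territory_id INTEGER);
    CREATE TABLE visiting_addresses (
        id INTEGER PRIMARY KEY, address_id INTEGER, date TEXT, status INTEGER);

    -- Made-up sample data; area 3 has no territories at all.
    INSERT INTO areas VALUES (1, 'North'), (2, 'South'), (3, 'East');
    INSERT INTO territories VALUES (1, 'T1', 1), (2, 'T2', 2);
    INSERT INTO addresses VALUES (1, 'A1', 1), (2, 'A2', 2);
    INSERT INTO visiting_addresses VALUES
        (1, 1, '2020-01-10', 1),
        (2, 1, '2020-01-20', 2),
        (3, 1, '2020-01-25', 1),
        (4, 2, '2020-02-01', 1);  -- outside the half-open January range
""")

QUERY = """
    SELECT ar.id, ar.name,
           COALESCE(x.visited, 0), COALESCE(x.pending, 0), COALESCE(x.total, 0)
    FROM areas ar
    LEFT JOIN (
        SELECT t.area_id,
               SUM(CASE WHEN va.status = 1 THEN 1 ELSE 0 END) AS visited,
               SUM(CASE WHEN va.status = 2 THEN 1 ELSE 0 END) AS pending,
               COUNT(va.id) AS total
        FROM territories t
        JOIN addresses a ON a.territory_id = t.id
        JOIN visiting_addresses va ON va.address_id = a.id
        WHERE va.date >= ? AND va.date < ?
        GROUP BY t.area_id
    ) x ON x.area_id = ar.id
    ORDER BY ar.id
"""

# Bind the month boundaries as parameters instead of inlining them.
rows = conn.execute(QUERY, ("2020-01-01", "2020-02-01")).fetchall()
print(rows)
```

Every area appears exactly once in the output, and COALESCE turns the NULLs of unmatched areas into zeros.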
