Very slow groupwise-minimum query

Problem description

The following query:

select t.code, t.value, t.date
from readings t inner join (
    select code, min(date) as MinDate
    from readings
    where date >= "2018-11-1"
    group by code
) tm on t.code = tm.code and t.date = tm.MinDate

takes more than a minute to respond.

The subquery on its own:

select code, min(date) as MinDate
from readings
where date >= "2018-11-1"
group by code

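returns in one second. Also, if I run the whole query with min() replaced by max(), it takes only one second as well.

I am using HeidiSQL, and the odd thing is that it reports: 1,578 sec. (+ 61,172 sec. network). But that makes no sense, because the max() version returns roughly the same amount of data and takes only one second.

Current indexes: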

PRIMARY             BTREE   Yes No  id      31096905    A   YES
code_date_unique    BTREE   Yes No  code    0           A   YES
                                    date    0           A   YES
code                BTREE   No  No  code    15801       A   YES
date                BTREE   No  No  date    1943556     A   YES

Create table:

CREATE TABLE `readings` (
`id` INT(15) NOT NULL AUTO_INCREMENT,
....
`code` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_spanish_ci',
`value` FLOAT(10,2) NULL DEFAULT NULL,
`date` DATETIME NULL DEFAULT NULL,
....
PRIMARY KEY (`id`),
UNIQUE INDEX `code_date_unique` (`code`, `date`),
INDEX `code` (`code`),
INDEX `code_DateSaved` (`code`, `dateSaved`),
INDEX `date` (`date`),
INDEX `datesaved` (`dateSaved`),
INDEX `origen` (`origen`)
)
COLLATE='utf8_spanish_ci'
ENGINE=MyISAM
AUTO_INCREMENT=584809517

Explain:

explain select  t.code, t.value, t.date
from readings t
inner join (
    select code, min(date) as MinDate
    from readings
    where date >= "2018-11-1"
    group by code
) tm on t.code = tm.code and t.date = tm.MinDate 

+------+---------------+--------------+---------+---------------------------------------------+--------------------+-----------+--------------+---------+-----------------------------------------+
| "id" | "select_type" |   "table"    | "type"  |               "possible_keys"               |       "key"        | "key_len" |    "ref"     | "rows"  |                 "Extra"                 |
| "1"  | "PRIMARY"     | "<derived2>" | "ALL"   | \N                                          | \N                 | \N        | \N           | "15052" | ""                                      |
| "1"  | "PRIMARY"     | "t"          | "ref"   | "code_date_unique,code,code_DateSaved,date" | "date"             | "9"       | "tm.MinDate" | "16"    | "Using where"                           |
| "2"  | "DERIVED"     | "readings"   | "range" | "date"                                      | "code_date_unique" | "62"      | \N           | "10"    | "Using where; Using index for group-by" |

Tags: mysql

Solution


I don't see a way to make your query faster by restructuring it. But we can try adding the following index to the readings table:

(code, date, value)
CREATE INDEX your_idx ON readings (code, date, value);

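This composite index should speed up the GROUP BY subquery by making it easy for MySQL to find the minimum date for each group. It should also make it easier for the WHERE clause to restrict rows before the GROUP BY.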
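One way to check whether the new index is actually picked up is to re-run EXPLAIN once it exists. The following is a minimal sketch, assuming the index was created under the name your_idx as above. Whether the optimizer chooses it depends on the server version and the table statistics, so treat the comments as things to look for rather than guaranteed output.

```sql
-- Refresh the key distribution statistics so the optimizer
-- has up-to-date information about the newly created index.
ANALYZE TABLE readings;

-- Re-check the plan. In the question's plan, the row for `t` uses
-- the single-column `date` index; the thing to look for here is
-- whether the "key" column for `t` now shows a (code, date, ...) index.
EXPLAIN
SELECT t.code, t.value, t.date
FROM readings t
INNER JOIN (
    SELECT code, MIN(date) AS MinDate
    FROM readings
    WHERE date >= '2018-11-01'   -- same cutoff as in the question
    GROUP BY code
) tm ON t.code = tm.code AND t.date = tm.MinDate;
```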

I edited the answer to include value in the index, so that all of the columns in the SELECT are covered by the index.
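"Covered" here means that code, date, and value can all be read from the index itself, without going back to the table rows. If the optimizer still prefers the single-column date index for the outer table, an index hint is one thing that could be tried. This is only a sketch under that assumption and has not been tested against this table; your_idx is the index name used above.

```sql
-- FORCE INDEX asks MySQL to use your_idx for the outer table `t`.
-- Since your_idx contains code, date and value, the outer lookup
-- can be answered from the index alone; in EXPLAIN this typically
-- shows up as "Using index" in the Extra column for `t`.
SELECT t.code, t.value, t.date
FROM readings t FORCE INDEX (your_idx)
INNER JOIN (
    SELECT code, MIN(date) AS MinDate
    FROM readings
    WHERE date >= '2018-11-01'   -- same cutoff as in the question
    GROUP BY code
) tm ON t.code = tm.code AND t.date = tm.MinDate;
```

If the hinted version turns out to be fast, that would suggest the slowdown comes from the index choice for the outer table rather than from the subquery, which the question already reports as returning in one second.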

