mysql - SELECT 语句优化 MySQL
问题描述
我正在寻找一种方法如何使我的 SELECT 查询比现在更快,因为我觉得应该可以让它更快。
这是查询
SELECT r.id_customer, ROUND(AVG(tp.percentile_weighted), 2) AS percentile
FROM tag_rating AS r USE INDEX (value_date_add)
JOIN tag_product AS tp ON (tp.id_pair = r.id_pair)
WHERE
r.value = 1 AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer
这是解释选择
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| 1 | SIMPLE | r | ref | value_date_add | value_date_add | 1 | const | 449502 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| 1 | SIMPLE | tp | eq_ref | PRIMARY | PRIMARY | 4 | dev.r.id_pair | 1 | |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
现在桌子是
CREATE TABLE `tag_product` (
`id_pair` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_product` int(10) unsigned NOT NULL,
`id_user_tag` int(10) unsigned NOT NULL,
`status` tinyint(3) NOT NULL,
`percentile` decimal(8,4) unsigned NOT NULL,
`percentile_weighted` decimal(8,4) unsigned NOT NULL,
`elo` int(10) unsigned NOT NULL,
`date_add` datetime NOT NULL,
`date_upd` datetime NOT NULL,
PRIMARY KEY (`id_pair`),
UNIQUE KEY `id_product_id_user_tag` (`id_product`,`id_user_tag`),
KEY `status` (`status`),
KEY `id_user_tag` (`id_user_tag`),
CONSTRAINT `tag_product_ibfk_5` FOREIGN KEY (`id_user_tag`) REFERENCES `user_tag` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `tag_rating` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_customer` int(10) unsigned NOT NULL,
`id_pair` int(10) unsigned NOT NULL,
`id_duel` int(10) unsigned NOT NULL,
`value` tinyint(4) NOT NULL,
`date_add` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_duel_id_pair` (`id_duel`,`id_pair`),
KEY `id_pair_id_customer` (`id_pair`,`id_customer`),
KEY `value` (`value`),
KEY `value_date_add` (`value`,`date_add`),
KEY `id_customer_value_date_add` (`id_customer`,`value`,`date_add`),
CONSTRAINT `tag_rating_ibfk_3` FOREIGN KEY (`id_pair`) REFERENCES `tag_product` (`id_pair`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `tag_rating_ibfk_6` FOREIGN KEY (`id_duel`) REFERENCES `tag_rating_duel` (`id_duel`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
表 tag_product 有大约 250k 行,而 tag_rating 有大约 1M 行。
我的问题是 SQL 查询在我的机器上平均需要大约 0.8 秒。我希望将其理想地控制在 0.5 秒以下,同时还假设表格可以变大 10 倍。由于我有一个日期条件(我只想要不到一个月大的行),所以使用的行数应该大致相同。
这是否可以通过一些技巧(也就是不重组我的表格)来加快速度?当我稍微修改(不加入较小的表)该语句为
SELECT r.id_customer, COUNT(*)
FROM tag_rating AS r USE INDEX (value_date_add)
WHERE
r.value = 1 AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer;
这是解释选择
+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
| 1 | SIMPLE | r | ref | value_date_add | value_date_add | 1 | const | 449502 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
大约需要 0.25 秒,这很棒。所以 JOIN 使它慢了 3 倍。这是不可避免的吗?我觉得因为我是通过主键加入的,所以它不应该让查询慢 3 倍。
- -更新 - -
这实际上是我的查询。不同的 id_customer 值的数量约为 1000 并且预计会增加,value=1 的行数正好是一半。到目前为止,查询性能似乎根据评级表中的行数线性减慢
在 id_customer_value_date_add 或 value_id_customer_date_add 索引末尾添加 id_pair 没有帮助。
SELECT r.id_customer, ROUND(AVG(tp.percentile_weighted), 2) AS percentile
FROM tag_rating AS r USE INDEX (id_customer_value_date_add)
JOIN tag_product AS tp ON (tp.id_pair = r.id_pair)
WHERE
r.value = 1 AND
r.id_customer IN (2593179,1461878,2318871,2654090,2840415,2852531,2987432,3473275,3960453,3961798,4129734,4191571,4202912,4204817,4211263,4248789,765650,1341317,1430380,2116196,3367674,3701901,3995273,4118307,4136114,4236589,783262,913493,1034296,2626574,3574634,3785772,2825128,4157953,3331279,4180367,4208685,4287879,1038898,1445750,1975108,3658055,4185296,4276189,428693,4248631,1892448,3773855,2901524,3830868,3934786) AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer
这是解释选择
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| 1 | SIMPLE | r | range | id_customer_value_date_add | id_customer_value_date_add | 10 | | 558906 | Using where; Using index |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| 1 | SIMPLE | tp | eq_ref | PRIMARY,status | PRIMARY | 4 | dev.r.id_pair | 1 | Using where |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
任何提示表示赞赏。谢谢
解决方案
INDEX(value, date_add, id_customer, id_pair)
将是“覆盖”,从而为这两个查询提供额外的性能提升。也适用于戈登的配方。
同时,摆脱这些:
KEY `value` (`value`),
KEY `value_date_add` (`value`,`date_add`),
因为它们可能会妨碍优化器选择新索引。使用这些索引的任何其他查询都将轻松使用新索引。
如果您没有以其他方式使用tag_rating.id
,请将其删除并提升UNIQUE
到PRIMARY KEY
.
推荐阅读
- python - 为什么我没有进入 python3 的 while 循环?
- python - 为什么在 openCV 和 python 中将 fps 从 30 翻倍到 60 时,生成的视频时间只有一半?
- php - 使用子字符串搜索数组的有效方法
- sql - 如何查询 MS-Access 中只有最后一个明细行的主明细结果?
- responsive-design - Material ui makestyles 中的响应式排版
- reactjs - 仅在第二次 onPress 后更新状态
- java - Java.lang.ClassNotFoundException:在路径上找不到类“com.example.dependencyinjection.MainActivity”
- php - PHP日期更改strtotime不能正常工作
- database-design - SqlAlchemy Core:多次定义一对一关系
- c# - 将参数从视图模型传递到另一个