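mysql - How to make this query faster and more efficient when the table holds millions of rows
Problem description
Is there a faster, more efficient way to show ranked data? The query I am using feels very slow now that the table has grown large.
I have these tables and data:
Table 1: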
CREATE TABLE `table1` (
  `location_id` varchar(5) COLLATE utf8mb4_unicode_ci NOT NULL,
  `location_name` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`location_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `table1` (`location_id`,`location_name`) VALUES
('001','Asia'),('002','Africa');
+-------------+---------------+
| location_id | location_name |
+-------------+---------------+
| 001         | Asia          |
| 002         | Africa        |
+-------------+---------------+
Table 2:
CREATE TABLE `table2` (
  `device_id` varchar(5) COLLATE utf8mb4_unicode_ci NOT NULL,
  `device_name` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `location_id` varchar(5) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`device_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `table2` (`device_id`,`device_name`, `location_id`) VALUES
('D001','Water Level','001'),('D002','Air Direction','001'),('D003','Rain Fall','001');
+-----------+---------------+-------------+
| device_id | device_name   | location_id |
+-----------+---------------+-------------+
| D001      | Water Level   | 001         |
| D002      | Air Direction | 001         |
| D003      | Rain Fall     | 001         |
+-----------+---------------+-------------+
Table 3:
CREATE TABLE `table3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `device_id` varchar(5) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `send_date` datetime DEFAULT NULL,
  `value` double DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `table3` (`device_id`,`send_date`, `value`) VALUES
('D001','2021-07-11 07:00',30.10),
('D001','2021-07-11 07:30',29.21),
('D001','2021-07-11 08:00',31.07),
('D002','2021-07-11 07:00',28.18),
('D002','2021-07-11 07:30',30.10),
('D002','2021-07-11 08:00',31.07),
('D003','2021-07-11 07:00',29.21),
('D003','2021-07-11 07:30',31.07),
('D003','2021-07-11 08:00',30.10),
('D001','2021-07-12 07:00',30.10),
('D001','2021-07-12 07:30',29.21),
('D001','2021-07-12 08:00',31.07),
('D002','2021-07-12 07:00',28.18),
('D002','2021-07-12 07:30',30.10),
('D002','2021-07-12 08:00',31.07),
('D003','2021-07-12 07:00',29.21),
('D003','2021-07-12 07:30',31.07),
('D003','2021-07-12 08:00',30.10);
+----+-----------+---------------------+-------+
| id | device_id | send_date           | value |
+----+-----------+---------------------+-------+
|  1 | D001      | 2021-07-11 07:00:00 |  30.1 |
|  2 | D001      | 2021-07-11 07:30:00 | 29.21 |
|  3 | D001      | 2021-07-11 08:00:00 | 31.07 |
|  4 | D002      | 2021-07-11 07:00:00 | 28.18 |
|  5 | D002      | 2021-07-11 07:30:00 |  30.1 |
|  6 | D002      | 2021-07-11 08:00:00 | 31.07 |
|  7 | D003      | 2021-07-11 07:00:00 | 29.21 |
|  8 | D003      | 2021-07-11 07:30:00 | 31.07 |
|  9 | D003      | 2021-07-11 08:00:00 |  30.1 |
| 10 | D001      | 2021-07-12 07:00:00 |  30.1 |
| 11 | D001      | 2021-07-12 07:30:00 | 29.21 |
| 12 | D001      | 2021-07-12 08:00:00 | 31.07 |
| 13 | D002      | 2021-07-12 07:00:00 | 28.18 |
| 14 | D002      | 2021-07-12 07:30:00 |  30.1 |
| 15 | D002      | 2021-07-12 08:00:00 | 31.07 |
| 16 | D003      | 2021-07-12 07:00:00 | 29.21 |
| 17 | D003      | 2021-07-12 07:30:00 | 31.07 |
| 18 | D003      | 2021-07-12 08:00:00 |  30.1 |
+----+-----------+---------------------+-------+
I use this query to show one row per group (the latest row for each device), but with several million rows it becomes very slow:
SELECT *
FROM (
  SELECT
    a.`location_name`, c.`device_id`, b.`device_name`, c.`send_date`, c.`value`,
    ROW_NUMBER() OVER (PARTITION BY c.`device_id` ORDER BY c.`send_date` DESC) AS ranking
  FROM table1 a
  INNER JOIN table2 b ON b.`location_id` = a.`location_id`
  INNER JOIN table3 c ON c.`device_id` = b.`device_id`
) AS ta
WHERE ta.ranking = 1
ORDER BY ta.device_id ASC;
Output:
+---------------+---------------+---------------------+-------+---------+
| location_name | device_name   | send_date           | value | ranking |
+---------------+---------------+---------------------+-------+---------+
| Asia          | Water Level   | 2021-07-12 09:00:00 | 28.18 |       1 |
| Asia          | Air Direction | 2021-07-12 09:00:00 |    28 |       1 |
| Asia          | Rain Fall     | 2021-07-12 09:00:00 |    28 |       1 |
+---------------+---------------+---------------------+-------+---------+
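Please help. Once the data reaches millions of rows, this query becomes very slow. Thank you for your answers.
Solution
You may find that a correlated subquery, backed by the right index, is faster: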
SELECT a.`location_name`, c.`device_id`, b.`device_name`, c.`send_date`, c.`value`
FROM table1 a
JOIN table2 b ON b.`location_id` = a.`location_id`
JOIN table3 c ON c.`device_id` = b.`device_id`
WHERE c.send_date = (SELECT MAX(c2.send_date)
                     FROM table3 c2
                     WHERE c2.device_id = c.device_id
                    );
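The same "latest row per device" lookup can also be written as a join against a derived table that computes MAX(send_date) per device once. This is a different formulation from the correlated subquery above, sketched here as something to benchmark rather than a guaranteed improvement. The alias `latest` and the column name `max_send_date` are chosen for illustration only.

```sql
-- Sketch: compute each device's latest send_date once, then join back to table3.
-- `latest` and `max_send_date` are illustrative names, not part of the original answer.
SELECT a.location_name, c.device_id, b.device_name, c.send_date, c.value
FROM (
    SELECT device_id, MAX(send_date) AS max_send_date
    FROM table3
    GROUP BY device_id
) AS latest
JOIN table3 c
    ON c.device_id = latest.device_id
   AND c.send_date = latest.max_send_date
JOIN table2 b ON b.device_id = c.device_id
JOIN table1 a ON a.location_id = b.location_id
ORDER BY c.device_id;
```

Like the correlated subquery, and unlike the ROW_NUMBER() version, this returns more than one row for a device if several rows share the same latest send_date.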
The index you want is on table3(device_id, send_date):
CREATE INDEX idx_table3_device_id_send_date ON table3(device_id, send_date);
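To check whether MySQL actually picks up the new index, the query can be prefixed with EXPLAIN. This is only a sketch of how one might verify it; the exact plan depends on the MySQL version and on the data distribution.

```sql
-- Sketch: inspect the plan for the correlated-subquery version,
-- assuming idx_table3_device_id_send_date has already been created.
EXPLAIN
SELECT a.location_name, c.device_id, b.device_name, c.send_date, c.value
FROM table1 a
JOIN table2 b ON b.location_id = a.location_id
JOIN table3 c ON c.device_id = b.device_id
WHERE c.send_date = (SELECT MAX(c2.send_date)
                     FROM table3 c2
                     WHERE c2.device_id = c.device_id);
```

If the index is in use, idx_table3_device_id_send_date would normally appear in the key column of the rows for table3 (aliases c and c2). A type of ALL on those rows suggests a full table scan instead.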