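mysql - Delete duplicate rows in MySQL where all columns match, keeping the latest row
Problem description
This topic has been discussed here several times, but I still find it hard to understand how to delete duplicate rows in my MySQL database. Yes, I have seen plenty of examples, but they are simplified to 2 columns or so, and most of the time the solution is just a query with no explanation. I want to clean up our database safely, so I have backed up my table, and now I want to delete the duplicate rows (I have a lot of them). Below is a query snippet I tried. It is fairly simple: for each group of duplicate rows, select the latest row and show its ID.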
SELECT MAX(id) FROM Raw_Validated_backup GROUP BY col1, col2, col3, ... col_n-1 having COUNT(*) > 1;
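Now that I have selected the latest row of each duplicate group, I "just" need to delete the other duplicates. Easier said than done, if you ask me. Here is a query written by @Jose Rui Santos: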
delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;
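He posted it as an answer in this thread: MySQL delete duplicate records but keep latest
The question is: how exactly do I delete the duplicate rows and keep the latest one? I would prefer a pseudo format, so that others can learn from it too. For example: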
DELETE table FROM database
rather than something specific like:
DELETE email FROM city
MySQL queries
Example 1:
DELETE t1 FROM Raw_Validated_backup AS t1 INNER JOIN Raw_Validated_backup AS t2
ON t1.time_start=t2.time_start
AND t1.time_end=t2.time_end
AND t1.first_temp_lpn=t2.first_temp_lpn
AND t1.first_WL=t2.first_WL
AND t1.first_temp_lpn_validated=t2.first_temp_lpn_validated
AND t1.second_temp_lpn=t2.second_temp_lpn
AND t1.second_WL=t2.second_WL
AND t1.second_temp_lpn_validated=t2.second_temp_lpn_validated
AND t1.third_temp_lpn=t2.third_temp_lpn
AND t1.third_WL=t2.third_WL
AND t1.third_temp_lpn_validated=t2.third_temp_lpn_validated
AND t1.first_temp_rising=t2.first_temp_rising
AND t1.first_WR=t2.first_WR
AND t1.first_temp_rising_validated=t2.first_temp_rising_validated
AND t1.second_temp_rising=t2.second_temp_rising
AND t1.second_WR=t2.second_WR
AND t1.second_temp_rising_validated=t2.second_temp_rising_validated
AND t1.third_temp_rising=t2.third_temp_rising
AND t1.third_WR=t2.third_WR
AND t1.third_temp_rising_validated=t2.third_temp_rising_validated
AND t1.id<t2.id;
Example 2:
DELETE FROM Raw_Validated_backup WHERE id NOT IN (
SELECT max(id) FROM Raw_Validated_backup
GROUP BY time_start, time_end, first_temp_lpn, first_WL, first_temp_lpn_validated, second_temp_lpn, second_WL, second_temp_lpn_validated, third_temp_lpn, third_WL, third_temp_lpn_validated, first_temp_rising, first_WR, first_temp_rising_validated, second_temp_rising, second_WR, second_temp_rising_validated, third_temp_rising, third_WR, third_temp_rising_validated )
Solution
I would do it like this:
mysql> select * from test;
+----+------------------+------+------+------+
| id | email            | col2 | col3 | col4 |
+----+------------------+------+------+------+
|  1 | user@example.org |    2 |    3 |    4 |
|  2 | user@example.org |    2 |    3 |    4 |
|  3 | user@example.org |    2 |    3 |    4 |
|  4 | user@example.org |    2 |    3 |    4 |
+----+------------------+------+------+------+
mysql> DELETE t1 FROM test AS t1 INNER JOIN test AS t2
ON t1.email=t2.email AND t1.col2=t2.col2 AND t1.col3=t2.col3 AND t1.col4=t2.col4
AND t1.id<t2.id;
mysql> select * from test;
+----+------------------+------+------+------+
| id | email            | col2 | col3 | col4 |
+----+------------------+------+------+------+
|  4 | user@example.org |    2 |    3 |    4 |
+----+------------------+------+------+------+
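Explanation: the join condition looks for a row t2 that has a larger id and is equal to t1 in all other columns. If such a row exists, t1 cannot be the row with the greatest id among its duplicates, so the t1 row should be deleted.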
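To check the logic above before deleting anything, one option is to run the same self-join as a SELECT first. The sketch below reuses the `test` table and its columns from the example. The second statement is a variant that assumes some of the compared columns may contain NULL, which the example does not state: with `=`, two NULL values never compare as equal, so such rows would not be treated as duplicates, whereas MySQL's NULL-safe operator `<=>` treats them as equal.

```sql
-- Preview: ids of the rows the DELETE would remove.
-- Same join as the DELETE; DISTINCT because one t1 row can match several t2 rows.
SELECT DISTINCT t1.id
FROM test AS t1
INNER JOIN test AS t2
   ON t1.email = t2.email
  AND t1.col2  = t2.col2
  AND t1.col3  = t2.col3
  AND t1.col4  = t2.col4
  AND t1.id    < t2.id;

-- Variant (assumption: compared columns may be NULL).
-- <=> is MySQL's NULL-safe equality, so NULL <=> NULL is true.
DELETE t1
FROM test AS t1
INNER JOIN test AS t2
   ON t1.email <=> t2.email
  AND t1.col2  <=> t2.col2
  AND t1.col3  <=> t2.col3
  AND t1.col4  <=> t2.col4
  AND t1.id    <  t2.id;
```

On the four sample rows, the preview should list every id except the largest one in the group, which is the row the DELETE keeps.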