Conditionally removing duplicates from a table in PostgreSQL

Problem description

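I want to delete rows with a duplicate "value", but only when the value did not change since the previous update. I read tutorials about LAG and LEAD, but couldn't find an example that deletes duplicates.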

Original:

+----+-------+-------+------------------------+
| ID | subID | value |       updated_at       |
+----+-------+-------+------------------------+
|  1 |     2 | 2.20  | 2020-02-16 07:36:25+01 |
|  1 |     2 | 2.20  | 2020-02-16 07:31:25+01 |
|  1 |     2 | 2.20  | 2020-02-16 07:26:25+01 |
|  1 |     2 | 2.30  | 2020-02-16 07:21:25+01 |
|  1 |     2 | 2.20  | 2020-02-16 07:16:25+01 |
|  1 |     2 | 2.20  | 2020-02-16 07:11:25+01 |
+----+-------+-------+------------------------+

Desired output:

+----+-------+-------+------------------------+
| ID | subID | value |       updated_at       |
+----+-------+-------+------------------------+
|  1 |     2 | 2.20  | 2020-02-16 07:36:25+01 |
|  1 |     2 | 2.30  | 2020-02-16 07:21:25+01 |
|  1 |     2 | 2.20  | 2020-02-16 07:16:25+01 | 
+----+-------+-------+------------------------+

Tags: database, postgresql, select, window-functions, gaps-and-islands

Solution


I would use LAG or LEAD and delete by ctid:

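-- Compare each row with the next (newer) row in the same id/subid group.
-- A row is redundant when the next update carries the same value, so the
-- latest row of every unchanged run survives, as in the desired output.
-- To keep the earliest row of each run instead, use LAG in place of LEAD.
DELETE FROM yourtable
WHERE ctid IN (
  SELECT ctid
  FROM (
    SELECT
      ctid,
      value,
      LEAD(value) OVER (PARTITION BY id, subid ORDER BY updated_at) AS nxt
    FROM yourtable
  ) t
  WHERE value = nxt  -- never true for the newest row, where nxt is NULL
);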
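
Which row of each unchanged run survives depends on the comparison direction, so it can help to preview both before deleting anything. The sketch below rebuilds the sample data in a temporary table and flags each row against its older and newer neighbour. The column types are assumptions guessed from the sample output, and the flag names same_as_previous and same_as_next are made up for illustration.

```sql
-- Assumed schema: the column types are guesses based on the sample data.
CREATE TEMP TABLE yourtable (
  id         integer,
  subid      integer,
  value      numeric(10,2),
  updated_at timestamptz
);

INSERT INTO yourtable (id, subid, value, updated_at) VALUES
  (1, 2, 2.20, '2020-02-16 07:36:25+01'),
  (1, 2, 2.20, '2020-02-16 07:31:25+01'),
  (1, 2, 2.20, '2020-02-16 07:26:25+01'),
  (1, 2, 2.30, '2020-02-16 07:21:25+01'),
  (1, 2, 2.20, '2020-02-16 07:16:25+01'),
  (1, 2, 2.20, '2020-02-16 07:11:25+01');

SELECT
  updated_at,
  value,
  -- TRUE when the older neighbour has the same value;
  -- deleting these rows keeps the EARLIEST row of each run.
  COALESCE(value = LAG(value)  OVER w, FALSE) AS same_as_previous,
  -- TRUE when the newer neighbour has the same value;
  -- deleting these rows keeps the LATEST row of each run.
  COALESCE(value = LEAD(value) OVER w, FALSE) AS same_as_next
FROM yourtable
WINDOW w AS (PARTITION BY id, subid ORDER BY updated_at)
ORDER BY updated_at DESC;
```

For the sample data, same_as_next is true for the rows at 07:11:25, 07:26:25 and 07:31:25, which leaves exactly the three rows in the desired output. same_as_previous is true for 07:16:25, 07:31:25 and 07:36:25, which would keep the earliest row of each run instead.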

As with any delete query from the Internet, run it against a copy of the table first.
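
One way to do such a dry run is sketched below. It assumes a scratch table named yourtable_copy (a hypothetical name) and uses the LEAD comparison, which keeps the latest row of each unchanged run. RETURNING lists the rows that were removed, and the transaction is rolled back so nothing is kept until the result has been checked.

```sql
-- Throwaway copy of the data; yourtable_copy is a hypothetical name.
CREATE TABLE yourtable_copy AS TABLE yourtable;

BEGIN;

DELETE FROM yourtable_copy
WHERE ctid IN (
  SELECT ctid
  FROM (
    SELECT
      ctid,
      value,
      LEAD(value) OVER (PARTITION BY id, subid ORDER BY updated_at) AS nxt
    FROM yourtable_copy
  ) t
  WHERE value = nxt
)
RETURNING *;  -- shows the rows that were deleted

-- Inspect what is left before deciding.
SELECT * FROM yourtable_copy ORDER BY id, subid, updated_at DESC;

ROLLBACK;  -- undo the delete; use COMMIT instead once the result looks right

DROP TABLE yourtable_copy;
```

Because ctid identifies a physical row version and can change when a row is updated, it is best used as here, computed and consumed inside a single statement.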

