首页 > 解决方案 > 重复中的sql重复

问题描述

很难说出我的问题。但是在没有示例的情况下,在重复项中找到重复项几乎是我所能得到的。

ID,stringA,stringB,dID,oID
1,hello,apple,1,1
2,bye,apple,2,1
3,world,banana,1,1

我正在尝试创建一个查询,该查询将为我提供唯一和不唯一以及OR不唯一oID的所有行。当找到不唯一的值并且行需要匹配以进行该比较时。如果不匹配,我们不关心那些重复。1dIDstringAstringBstringAstringBdIDdID

我对上述数据集的预期查询不会返回任何内容。因为没有dID不唯一的行,但也不具有stringAor的唯一值stringB

ID,stringA,stringB,dID,oID
1,hello,apple,1,1
2,bye,apple,2,1
3,world,banana,1,1
4,hello,null,1,1

我对这个给定数据集的预期查询将返回第 1 行和第 4 行。因为它们oIDdID匹配和stringA匹配。

我尝试在字段上使用嵌套选择having count(*) > 1,但它不断从第一个数据集中返回第 1 行。

select * from table1
                    where oID = 1 and
                        dID in (select dID from table1 where oID = 1 and dID > 0 group by dID having count(*) > 1) and
                        (
                            stringA in (select stringA from table1 where oID = 1 and dID > 0 group by stringA having count(*) > 1) or
                            stringB in (select stringB from table1 where oID = 1 and dID > 0 group by stringA having count(*) > 1)
                        );

任何帮助表示赞赏,谢谢!


更正了结果集。

增加了条件的清晰度。


我认为这个非常丑陋的查询正在运行,但事实证明它并没有完全运行。

select * from t where stringB in (select stringB from (select * from (select * from t where (stringB in (select stringB from t where oID = 1 and dID > 0 group by stringB having count(*) > 1) or
                            stringA in (select stringA from t where oID = 1 and dID > 0 group by stringA having count(*) > 1))
and dID in (select dID from t where oID = 1 and dID > 0 group by dID having count(*) > 1)) as foo) as bar where oID = 1 and dID > 0 group by stringB having count(*) > 1) or
                            stringA in (select stringA from (select * from (select * from t where (stringB in (select stringB from t where oID = 1 and dID > 0 group by stringB having count(*) > 1) or
                            stringA in (select stringA from t where oID = 1 and dID > 0 group by stringA having count(*) > 1))
and dID in (select dID from t where oID = 1 and dID > 0 group by dID having count(*) > 1)) as foo1) as bar1 where oID = 1 and dID > 0 group by stringA having count(*) > 1);

标签: sqlpostgresql

解决方案


由于没有迹象表明您关心存在多少“重复项”,因此您可以使用 EXISTS 至少有 1 个“重复项”。所以:

select * 
  from some_duplicates sd1 
 where oid = 1
   and (   sd1.StringA is not null
        or sd1.StringB is not null
       )
   and exists 
       ( select null 
           from some_duplicates sd2 
          where sd2.oid = sd1.oid
            and sd2.id != sd1.id
            and sd2.did = sd1.did
            and (   sd2.StringA is not null
                 or sd2.StringB is not null
                )
            and (   (sd2.StringA is not distinct from sd1.StringA)
                 or (sd2.StringB is not distinct from sd1.StringB) 
                )
       );

在这里看小提琴


推荐阅读