首页 > 解决方案 > 查询以从具有重复记录的表中获取最小 id

问题描述

有一个表A。

资源 Legacy_Address_ID 城市 状态码 PIN码 ADR_LINE_1
C T173005029 兰开斯特 功放 17602 N石灰街
小号 84191 兰开斯特 功放 17602 N石灰街
小号 10020 圣克莱尔海岸 心肌梗死 48081 英里路
C 85230429169 圣克莱尔海岸 心肌梗死 48081 英里路
C 8770211698 乔利特 伊利诺伊州 60434 邮政信箱 666
小号 9348710 乔利特 伊利诺伊州 60434 邮政信箱 666

表中的记录基本上具有相同的 City、state、pin、adr_line_1 但不同的 Legacy_Address_ID。我想要如下输出:

Legacy_Address_ID New_Legacy_Address_ID
T173005029 84191
85230429169 10020
8770211698 9348710

基本上,最小地址 id(具有源 S)为 new_legacy_Address_id,较大的(具有源 C)为 legacy_address_id。

我试过了

select * 
from 
    (select 
         legacy_address_id, 
         min(legacy_address_id) over (partition by adr_line_1, city, state_code, pincode 
                                      order by legacy_address_id) as new_legacy_address_id 
     from 
         tableA)
where 
    legacy_address_id <> new_legacy_address_id;

此查询作为输出返回:

Legacy_Address_ID New_Legacy_Address_ID
T173005029 84191
85230429169 10020
9348710 8770211698

最后一条记录未正确返回。有人可以帮我吗?

标签: sqloracle

解决方案


使用PIVOT

SELECT Legacy_Address_ID,
       New_Legacy_Address_ID
FROM   tablea
PIVOT (
  MAX(legacy_address_id)
  FOR source IN (
    'C' AS Legacy_Address_ID,
    'S' AS New_Legacy_Address_ID
  )
)

或者,条件聚合:

SELECT MIN(CASE source WHEN 'C' THEN Legacy_Address_ID END) AS Legacy_Address_ID,
       MIN(CASE source WHEN 'S' THEN Legacy_Address_ID END) AS New_Legacy_Address_ID
FROM   tablea
GROUP BY City, State_Code, Pincode, ADR_LINE_1

或者,使用带有分析函数的条件聚合:

select * 
from   (
  select legacy_address_id, 
         min(CASE source WHEN 'S' THEN legacy_address_id END) over (
           partition by adr_line_1, city, state_code, pincode
         ) as new_legacy_address_id,
         source
  from   tableA
)
where  source = 'C';

其中,对于您的示例数据:

CREATE TABLE tablea (Source, Legacy_Address_ID, City, State_Code, Pincode, ADR_LINE_1 ) AS
SELECT 'C', 'T173005029',  'LANCASTER',          'PA', 17602, 'N LIME ST'  FROM DUAL UNION ALL
SELECT 'S', '84191',       'LANCASTER',          'PA', 17602, 'N LIME ST'  FROM DUAL UNION ALL
SELECT 'S', '10020',       'SAINT CLAIR SHORES', 'MI', 48081, 'MILE RD'    FROM DUAL UNION ALL
SELECT 'C', '85230429169', 'SAINT CLAIR SHORES', 'MI', 48081, 'MILE RD'    FROM DUAL UNION ALL
SELECT 'C', '8770211698',  'JOLIET',             'IL', 60434, 'PO BOX 666' FROM DUAL UNION ALL
SELECT 'S', '9348710',     'JOLIET',             'IL', 60434, 'PO BOX 666' FROM DUAL;

所有输出:

LEGACY_ADDRESS_ID NEW_LEGACY_ADDRESS_ID
8770211698 9348710
T173005029 84191
85230429169 10020

db<>在这里摆弄


推荐阅读