Find unique values, count duplicates, and rank them using WITH on PostgreSQL 12

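Problem description

I have 3 complex tables. For this question I will simplify the usage. I need the rank, the count (of duplicates), and the unique records (the result). It works with a single table, but as soon as the WITH includes another table via INNER JOIN, I no longer get any records.

Tables: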

CREATE TABLE public.emails (
  id                bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
    (MAXVALUE 9223372036854775807),
  sender            jsonb NOT NULL
);


CREATE TABLE public.contacts (
  id                bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
    (MAXVALUE 9223372036854775807),
  email            text NOT NULL,
  full_name            text NOT NULL
);
-- sample data
insert into emails (sender) VALUES ('{"email": "dennis1@example.com", "name": "dennis1"}');
insert into emails (sender) VALUES ('{"email": "dennis1@example.com", "name": "dennis1"}');

insert into contacts (email, full_name) VALUES ('dennis1@example.com', 'dennis1');
insert into contacts (email, full_name) VALUES ('dennis1@example.com', 'dennis1');
insert into contacts (email, full_name) VALUES ('dennis5@example.com', 'dennis5');
insert into contacts (email, full_name) VALUES ('john@example.com', 'john');

Expected result:

email                   name        rk      count

dennis1@example.com     dennis1     1       4
dennis5@example.com     dennis5     1       1
john@example.com        john        1       1

However, I ran into two problems:

  1. The INNER JOIN returns zero rows.
  2. ORDER BY "count" does not work.

What do I need?

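As you can see, the tables differ. One table has a jsonb column, the other stores the values as text. So I extract them separately in each SELECT query and then compare them.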
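
A minimal sketch of that extraction, for illustration only (it uses the simplified tables above, not the real schema): the ->> operator returns a jsonb field as text, so the result can be compared directly with the text columns in contacts, whereas -> would return jsonb.

```sql
-- ->> yields text, so these columns line up with
-- contacts.email and contacts.full_name
SELECT
    sender->>'email' AS email,
    sender->>'name'  AS full_name
FROM emails;
```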

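So what I need is to get all emails and names, make them unique, count how often each one is duplicated, and rank them. I don't want the duplicate entries themselves; they should be merged into count.

How can I solve this?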

Demo

See the demo here: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=b79700f74bbf14e190d5f5bf7fcd0670

Tags: sql, postgresql, postgresql-12

Solution


Extract the JSON fields and union the two data sets before grouping and applying the window function.

WITH united as (
    SELECT email, full_name FROM contacts
    UNION ALL
    SELECT sender->>'email', sender->>'name' FROM emails
)
SELECT
  email
, full_name
, count(*) AS count
, row_number() OVER (PARTITION BY email) AS rk
FROM united
GROUP BY 1, 2;

Result:

        email        | full_name | count | rk
---------------------+-----------+-------+----
 dennis1@example.com | dennis1   |     4 |  1
 dennis5@example.com | dennis5   |     1 |  1
 john@example.com    | john      |     1 |  1
(3 rows)
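
The query above does not yet address the ORDER BY problem from the question. The following variant is a sketch, not part of the original answer: it assumes the most frequent address should come first and that, within one email, rk should follow the count. It names the GROUP BY columns explicitly and sorts on the quoted count alias.

```sql
WITH united AS (
    -- contacts already stores plain text
    SELECT email, full_name FROM contacts
    UNION ALL
    -- extract the jsonb fields as text so both branches have the same types
    SELECT sender->>'email', sender->>'name' FROM emails
)
SELECT
    email,
    full_name,
    count(*) AS "count",
    -- assumption: rank names within one email by how often they occur
    row_number() OVER (PARTITION BY email ORDER BY count(*) DESC) AS rk
FROM united
GROUP BY email, full_name
-- assumption: most frequent first, ties broken by email
ORDER BY "count" DESC, email;
```

With the sample data, dennis1@example.com (count 4) should be listed first, followed by the two single-occurrence addresses ordered by email.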
