Select all values up to the latest date that is shared between rows grouped by value

Problem description

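I have a PostgreSQL table that lists values for countries and their continents over time. Values can be NULL. I want to get the sum for each continent over time, up to the latest date on which every country in that continent has data.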

Here is my table (view on DB Fiddle):

| continent | country | date       | value | id  |
| --------- | ------- | ---------- | ----- | --- |
| Europe    | Germany | 2020-05-25 | 10    | 1   |
| Europe    | Germany | 2020-05-26 | 11    | 2   |
| Europe    | Germany | 2020-05-27 | 12    | 3   |
| Europe    | Germany | 2020-05-28 | 13    | 4   |
| Europe    | Italy   | 2020-05-25 | 20    | 5   |
| Europe    | Italy   | 2020-05-26 | 21    | 6   |
| Europe    | Italy   | 2020-05-27 | 22    | 7   |
| Europe    | Italy   | 2020-05-28 | 23    | 8   |
| Europe    | France  | 2020-05-25 | 30    | 9   |
| Europe    | France  | 2020-05-26 | 31    | 10  |
| Europe    | France  | 2020-05-27 | 32    | 11  |
| Europe    | France  | 2020-05-28 | NULL  | 12  |
| Africa    | Congo   | 2020-05-25 | 40    | 13  |
| Africa    | Congo   | 2020-05-26 | 41    | 14  |
| Africa    | Congo   | 2020-05-27 | NULL  | 15  |
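
For anyone who wants to reproduce the data locally, the table above could be recreated roughly as follows. The column types and the primary key on `id` are assumptions, since the original schema is not shown here.

```sql
-- Assumed schema: the types and constraints are guesses based on the sample rows.
CREATE TABLE countries (
    id        integer PRIMARY KEY,
    continent text    NOT NULL,
    country   text    NOT NULL,
    date      date    NOT NULL,
    value     integer            -- nullable: NULL means "no data for that day"
);

-- Sample rows copied from the table above.
INSERT INTO countries (continent, country, date, value, id) VALUES
    ('Europe', 'Germany', '2020-05-25', 10,   1),
    ('Europe', 'Germany', '2020-05-26', 11,   2),
    ('Europe', 'Germany', '2020-05-27', 12,   3),
    ('Europe', 'Germany', '2020-05-28', 13,   4),
    ('Europe', 'Italy',   '2020-05-25', 20,   5),
    ('Europe', 'Italy',   '2020-05-26', 21,   6),
    ('Europe', 'Italy',   '2020-05-27', 22,   7),
    ('Europe', 'Italy',   '2020-05-28', 23,   8),
    ('Europe', 'France',  '2020-05-25', 30,   9),
    ('Europe', 'France',  '2020-05-26', 31,   10),
    ('Europe', 'France',  '2020-05-27', 32,   11),
    ('Europe', 'France',  '2020-05-28', NULL, 12),
    ('Africa', 'Congo',   '2020-05-25', 40,   13),
    ('Africa', 'Congo',   '2020-05-26', 41,   14),
    ('Africa', 'Congo',   '2020-05-27', NULL, 15);
```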

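This is what I want to get back. Note that Europe includes data up to the 27th, because France has no data for the 28th, while Africa stops at the 26th, because that is the last date for which its only country has data.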

| continent | date       | value |
| --------- | ---------- | ----- |
| Europe    | 2020-05-27 | 66    |
| Africa    | 2020-05-26 | 41    |
| Europe    | 2020-05-26 | 63    |
| Africa    | 2020-05-25 | 40    |
| Europe    | 2020-05-25 | 60    |

I almost got there by including the number of countries that have data for each continent on each date.

SELECT
    countries.continent,
    countries.date,
    SUM(countries.value) AS value,
    COUNT(countries.country) AS countries_count
FROM
    countries
WHERE
    countries.value IS NOT NULL
GROUP BY
    countries.continent,
    countries.date
ORDER BY
    countries.date DESC,
    countries.continent;

| continent | date       | value | countries_count |
| --------- | ---------- | ----- | --------------- |
| Europe    | 2020-05-28 | 36    | 2               |
| Europe    | 2020-05-27 | 66    | 3               |
| Africa    | 2020-05-26 | 41    | 1               |
| Europe    | 2020-05-26 | 63    | 3               |
| Africa    | 2020-05-25 | 40    | 1               |
| Europe    | 2020-05-25 | 60    | 3               |

I also managed to get the number of countries per continent.

SELECT
    countries.continent,
    COUNT(DISTINCT countries.country) as number_of_countries
FROM
    countries
GROUP BY
    countries.continent;

| continent | number_of_countries |
| --------- | ------------------- |
| Africa    | 1                   |
| Europe    | 3                   |

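I'm stuck on how to combine these two queries so that rows which don't cover the continent's full number of countries are filtered out (for example, keep only the rows where `countries_count` is 3 for Europe and 1 for Africa).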

This is the final result I want returned:

| continent | date       | value |
| --------- | ---------- | ----- |
| Europe    | 2020-05-27 | 66    |
| Africa    | 2020-05-26 | 41    |
| Europe    | 2020-05-26 | 63    |
| Africa    | 2020-05-25 | 40    |
| Europe    | 2020-05-25 | 60    |

Or maybe there is a completely different way to solve this?


Tags: sql, postgresql

Solution


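You can use `NOT IN` in the `WHERE` clause. The subquery has to return the continent together with the date: filtering on the date alone would also drop Europe's 2020-05-27 row, because Congo has a NULL on that day.

SELECT
    c.continent,
    c.date,
    SUM(c.value) AS value,
    COUNT(DISTINCT c.country) AS countries_count
FROM countries c
-- Exclude only the (continent, date) pairs where some country has no value
WHERE (c.continent, c.date) NOT IN 
    ( SELECT continent, date 
        FROM countries 
       WHERE value IS NULL )
GROUP BY c.continent, c.date
ORDER BY c.date DESC, c.continent;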
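
The count-based idea from the question can also be written as a single query. The following is a minimal sketch, not the original answer: it keeps a continent/date group only when the number of non-NULL values equals the total number of distinct countries in that continent. The alias `c2` is just an illustrative name.

```sql
SELECT
    c.continent,
    c.date,
    SUM(c.value) AS value
FROM countries c
GROUP BY c.continent, c.date
-- COUNT(c.value) ignores NULLs, so it is the number of countries with data that day.
HAVING COUNT(c.value) = (
    SELECT COUNT(DISTINCT c2.country)
    FROM countries c2
    WHERE c2.continent = c.continent   -- total countries in this continent
)
ORDER BY c.date DESC, c.continent;
```

On the sample data this should return the five rows listed in the desired result. Unlike a filter on `value IS NULL`, it also drops dates for which a country's row is missing entirely rather than present with a NULL value.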

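If "up to the latest date" is meant literally, another option is to compute a cutoff date per continent and sum everything on or before it. This is a sketch under that interpretation; the CTE names `last_per_country` and `cutoff` are made up for illustration.

```sql
WITH last_per_country AS (
    -- Last date on which each country actually has a value.
    SELECT continent, country, MAX(date) AS last_date
    FROM countries
    WHERE value IS NOT NULL
    GROUP BY continent, country
),
cutoff AS (
    -- The earliest of those last dates is the latest date shared by all countries.
    SELECT continent, MIN(last_date) AS cutoff_date
    FROM last_per_country
    GROUP BY continent
)
SELECT
    c.continent,
    c.date,
    SUM(c.value) AS value
FROM countries c
JOIN cutoff k ON k.continent = c.continent
WHERE c.date <= k.cutoff_date
GROUP BY c.continent, c.date
ORDER BY c.date DESC, c.continent;
```

The two approaches agree on the sample data but differ when a country has a NULL in the middle of its series: this version still reports that date (summing the values that exist), while a per-date completeness check drops it. A country with no values at all does not appear in `last_per_country` and is therefore ignored when the cutoff is computed.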