首页 > 解决方案 > 数百万条记录的 Postgresql 数据库变慢(选择)

问题描述

尝试获取数百万条记录的数据时,Postgresql 数据库变慢。我尝试使用Materialized view,但是性能非常快,但它没有提供实时数据。

我也在使用聚合前。sum、count、group by 子句等...

SELECT offer_id as off_id,
  COUNT(distinct ip) as hosts,
  COUNT(distinct click_id) as clicks 
FROM offer_affiliate_stats 
WHERE 
  created_dt >= '2019-06-01' 
AND 
  created_dt  <= '2019-06-30' 
GROUP BY off_id;

我试过Materialized view

索引应用于 id、created_dt、click_id

我的输出应该是这样的:

off_id               | 79
hosts                | 4
clicks               | 4
offer_name           | "Testing Javelin"
offer_id             | 
total_conversions    | 
total_income         | 
optimised_count      | 
optimised_income     | 
approved_income      | 
approved_conversions | 
declined_income      | 
declined_conversions | 
total_payout         | 

实际上,如果不使用distinct关键字,它可以完美地工作,但是当我使用distinct时需要很长时间。

标签: postgresqlquery-performance

解决方案


你应该配置你的数据库吗?

您会看到以下链接:https ://www.postgresql.org/docs/current/runtime-config-resource.html

特别是work_mem默认值是 4MB。您可以增加到 100MB。

您更改代码如下:

SELECT  offer_id as off_id,
        COUNT(ip) as hosts,
        COUNT(click_id) as clicks 

FROM 
(select distinct offer_id,
        ip ,
        click_id    
        from offer_affiliate_stats 
WHERE created_dt >= '2019-06-01' 
AND created_dt  <= '2019-06-30' ) as t
GROUP BY off_id;

推荐阅读