Postgres crashes with a segmentation fault when using LEFT JOIN

Problem description

My Postgres version is 9.6.12. When I run the query below with LEFT JOIN, Postgres crashes with the error shown further down. When I replace LEFT JOIN with JOIN, the query works fine.

proddb=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p 
JOIN historic.audit_event ON event_code = 5199
 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date 
AND event_date < (run_date + '1 day'::interval);
 result
--------
      0
(1 row)

-- When I change the JOIN to LEFT JOIN
proddb=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p
LEFT JOIN historic.audit_event ON event_code = 5199 
AND event_param1 = 'fullscriptpendingorders' 
AND event_date > run_date 
AND event_date < (run_date + '1 day'::interval);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

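The Postgres log shows the following errors, indicating that Postgres crashed. Postgres recovers automatically after a few minutes, but this affects a critical production system. Can you help? Is this a Postgres bug?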

2019-04-08 21:32:56 PDT [26911]: [19631-1] [] [] LOG:  00000: server process (PID 23981) was terminated by signal 11: Segmentation fault
2019-04-08 21:32:56 PDT [26911]: [19632-1] [] [] DETAIL:  Failed process was running: SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p LEFT JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'amsa.fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
2019-04-08 21:32:56 PDT [26911]: [19633-1] [] [] LOCATION:  LogChildExit, postmaster.c:3574
2019-04-08 21:32:56 PDT [26911]: [19634-1] [] [] LOG:  00000: terminating any other active server processes
2019-04-08 21:32:56 PDT [26911]: [19635-1] [] [] LOCATION:  HandleChildCrash, postmaster.c:3294
2019-04-08 21:32:56 PDT [24633]: [1-1] [[unknown]] [[unknown]] LOG:  00000: connection received: host=[local]
2019-04-08 21:32:56 PDT [24633]: [2-1] [[unknown]] [[unknown]] LOCATION:  BackendInitialize, postmaster.c:4192
2019-04-08 21:32:56 PDT [24633]: [3-1] [postgres] [emr_prod] FATAL:  57P03: the database system is in recovery mode
2019-04-08 21:32:56 PDT [24633]: [4-1] [postgres] [emr_prod] LOCATION:  ProcessStartupPacket, postmaster.c:2230
2019-04-08 21:32:56 PDT [26911]: [19636-1] [] [] LOG:  00000: all server processes terminated; reinitializing
2019-04-08 21:32:56 PDT [26911]: [19637-1] [] [] LOCATION:  PostmasterStateMachine, postmaster.c:3818

I do not see this behavior on a QA database running version 9.6.9 with the exact same data set.

postgres@emr_qa07=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

postgres@emr_qa07=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
 result
--------
      1
(1 row)

postgres@emr_qa07=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p LEFT JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
 result
--------
      1
(1 row)

Tags: postgresql

Solution


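Yes, this is a PostgreSQL bug: it should not segfault no matter what query you give it. If you can reduce it to a minimal test case, the developers would probably be interested in a bug report.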
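
One way to start reducing the problem to a minimal test case is to copy only the rows the join condition can match into a scratch table and rerun the same query shape against it. The following is a rough sketch, not a verified reproduction: the `repro` schema name and the column types are assumptions, since the question does not show the definition of historic.audit_event. Note that the copy step reads the original table, so it is best run on a non-production copy of the database.

```sql
-- Hypothetical scratch schema; the real column types of historic.audit_event
-- are not shown in the question, so the types below are assumptions.
CREATE SCHEMA repro;
CREATE TABLE repro.audit_event (
    event_code   integer,
    event_param1 text,
    event_date   timestamp
);

-- Copy only the rows that the join condition can possibly match.
INSERT INTO repro.audit_event (event_code, event_param1, event_date)
SELECT event_code, event_param1, event_date
FROM historic.audit_event
WHERE event_code = 5199
  AND event_param1 = 'fullscriptpendingorders';

-- Refresh planner statistics for the scratch table.
ANALYZE repro.audit_event;

-- Same query shape as the one that crashes, pointed at the scratch table.
SELECT 1 - count(event_date) AS result
FROM (SELECT now()::date AS run_date) p
LEFT JOIN repro.audit_event
       ON event_code = 5199
      AND event_param1 = 'fullscriptpendingorders'
      AND event_date > run_date
      AND event_date < (run_date + '1 day'::interval);
```

If the scratch table reproduces the crash, the table definition plus a handful of rows may be enough for a bug report. If it does not, that would be consistent with the problem lying in the original table's indexes or data files rather than in the query itself.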

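Another possibility here is that the underlying data files of your database are corrupted. PostgreSQL crashing would still be a bug, but it would be much harder to determine what happened. A dump and restore of the database would potentially fix it.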

