postgresql - PostgreSQL: Drop in performance with multiple running functions without hardware bottleneck
问题描述
I have simple function, if I run it it takes around 40 second to finish.
select * from f_cyklus1(100000000)
but if I run this function 8 times in 8 separated instances, meaning all 8 function are running in parallel it takes around 210 to 260 seconds to finish for each of it's instances. Which is a massive drop in performance. I tried to compile it as 8 individual functions and run it again but it had no change in performance.
select * from f_cyklus1(100000000);
select * from f_cyklus2(100000000);
select * from f_cyklus3(100000000);
select * from f_cyklus4(100000000);
select * from f_cyklus5(100000000);
select * from f_cyklus6(100000000);
select * from f_cyklus7(100000000);
select * from f_cyklus8(100000000);
So why it takes 40s compare to 210-260s to finish? Our virtual machine has 16 CPUs and physical hardware was at low usage. I was also the only one using the Postgre database at the time of testing.
create or replace function f_cyklus1 (p_rozsah int) returns bigint as -- drop function f_cyklus(int)
$body$
declare
declare
v_exc_context TEXT;
v_result INTEGER;
p_soucet bigint :=0;
begin
for i in 0..p_rozsah
loop
p_soucet = p_soucet + i;
end loop;
return p_soucet;
EXCEPTION
WHEN OTHERS THEN
GET STACKED DIAGNOSTICS v_exc_context = PG_EXCEPTION_CONTEXT;
PERFORM main.ut_log('ERR', SQLERRM || ' [SQL State: ' || SQLSTATE || '] Context: ' || v_exc_context );
RAISE;
END;
$body$ LANGUAGE plpgsql
PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
Virtual machine: Centos 7 + KVM
HW: 2x AMD EPYC 7351 + 256 GB RAM
Note: I already asked similar question where I thought it was due to asynchronous processing, but this shows the problem is actually in raw Postgres performance, therefore I deleted my former question and asked this new one.
解决方案
p_soucet = p_soucet + i;
每次执行此操作时,它都必须获取运行语句的“快照”,因为它在后台使用常规 SQL 引擎,并且始终需要在快照中运行。获取快照需要系统范围的锁定。您同时运行的进程越多,它们花费更多时间来获取快照,而不是做有用的工作。
如果您在设置为“可重复读取”的事务中运行该函数,您会发现它们的扩展性更好,因为它们在持续时间内保持相同的快照并继续重复使用它。当然,这可能会干扰您的实际用例。
plpgsql 并不是很适合这种工作,暂时搁置。您可以使用其他 pl 语言之一,例如 plperl 或 plpythonu。
在https://www.postgresql.org/docs/current/plpgsql-expressions.html中描述了主 SQL 引擎如何评估表达式
从https://www.postgresql.org/docs/current/mvcc.html开始的文档中一般会讨论快照
我不知道最终用户在任何地方都记录了两者之间的交互。
推荐阅读
- c# - 如何将 ItemTemplate 的 TextBox 文本正确绑定到 List 的字符串元素?
- json - 如何修复跨站脚本弱点(在 JSON 响应中持续存在并在 Json 响应中反映)
- flutter - 有没有办法在 Flutter 中使用谷歌地图创建可搜索区域(使用 google_maps_flutter)?
- regex - 检查字符串 bash 中是否存在 2 个子字符串
- c# - C# 从 C++ cfg 文件导入图像坐标
- python-3.x - 使用 Python 通过 Kerberos 身份验证连接到 HDFS
- c# - 如何使用从控制器保留的文本框值重定向回视图上的表单
- typescript - 在ngrx实体中请告诉我什么类型
处于实体状态 - mysql - 无法在本地实例 MySQL 中构建架构
- html - 在 django 服务器中运行时 html 模板的 css 问题