Are there any performance or functionality downsides after using pg_upgrade with the --link option?

Question

I'm about to upgrade a quite large PostgreSQL cluster from 9.3 to 11.

The upgrade

The cluster is approximately 1.2 TB in size. The server has 192 GB of RAM, 64 cores, and a fast hardware RAID 10 array of 8 DC-edition SSDs. I am performing the upgrade by first replicating the data to a new server with streaming replication, then upgrading that server to 11.

I tested the upgrade using pg_upgrade with the --link option; this takes less than a minute. I also tested a regular upgrade (without --link) with many jobs, which takes several hours (4+).

Questions

Now the obvious choice is of course for me to use the --link option. However, all this makes me wonder: are there any downsides (performance- or functionality-wise) to using it over the regular, slower method? I do not know the internal workings of PostgreSQL's data structures, but I have a feeling there could be a performance difference after the upgrade between rewriting the data entirely and just using hard links, whatever that means.

Considerations

The only drawback of --link that I can find in the documentation is that the old data directory can no longer be accessed after the upgrade is performed: https://www.postgresql.org/docs/11/pgupgrade.htm However, that is only a safety concern, not a performance drawback, and it doesn't really apply in my case since I replicate the data first. The only other thing I can think of is reclaiming space, with whatever performance upsides that might have. As I understand it, though, that can also be achieved by running VACUUM FULL (or CLUSTER?) on the database after the --link upgrade. Also, as I understand it, reclaiming space does not have much impact on performance on an SSD.

I would appreciate it if anyone could shed some light on this.

Tags: postgresql, database-performance, pg-upgrade

Answer


There is absolutely no downside to using hard links (except, as you noted, that the old cluster is dead and has to be removed).

A hard link is in no way different from a normal file.

A "file" in UNIX is really an "inode", a structure containing the file's metadata. An entry in a directory is a (hard) link to that inode.

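If you create another hard link to the inode, the same file will be in two different directories, but that has no impact whatsoever on the behavior of the file.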
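To make the inode point concrete, here is a minimal Python sketch, assuming a filesystem that supports hard links. The directory names `old_cluster` and `new_cluster` and the file name `16384` are made up for illustration and are not taken from a real data directory. It creates a file, hard-links it into a second directory, and compares the inode number and link count seen through both names.

```python
import os
import tempfile


def link_info(path):
    """Return (inode number, hard-link count) for a path."""
    st = os.stat(path)
    return st.st_ino, st.st_nlink


def demo_hard_link():
    """Create a file, hard-link it into another directory, and compare."""
    with tempfile.TemporaryDirectory() as base:
        # Hypothetical stand-ins for the old and new data directories.
        old_dir = os.path.join(base, "old_cluster")
        new_dir = os.path.join(base, "new_cluster")
        os.mkdir(old_dir)
        os.mkdir(new_dir)

        old_file = os.path.join(old_dir, "16384")
        new_file = os.path.join(new_dir, "16384")

        with open(old_file, "wb") as f:
            f.write(b"table data")

        # Add a second directory entry pointing at the same inode.
        os.link(old_file, new_file)

        return link_info(old_file), link_info(new_file)


old_info, new_info = demo_hard_link()
print("same inode and link count:", old_info == new_info)
print("link count:", new_info[1])
```

Both names report the same inode with a link count of 2, so neither one is "the original" and neither is slower to access than the other.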

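Of course, you must make sure that you don't start both the old and the new server. Instant data corruption would ensue. That is why you should remove the old cluster as soon as possible.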
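A small Python sketch of why both points hold, again assuming a filesystem with hard-link support and using made-up file names (`old_link`, `new_link`): a write made through one name is immediately visible through the other, because there is only one file, and removing the old name leaves the data fully intact under the new one.

```python
import os
import tempfile


def shared_write_demo():
    """Show that two hard links share one set of data blocks."""
    with tempfile.TemporaryDirectory() as base:
        old_name = os.path.join(base, "old_link")  # hypothetical name
        new_name = os.path.join(base, "new_link")  # hypothetical name

        with open(old_name, "wb") as f:
            f.write(b"AAAA")
        os.link(old_name, new_name)

        # Overwrite the first two bytes through the "new" name.
        with open(new_name, "r+b") as f:
            f.write(b"BB")

        # The change shows up when reading through the "old" name.
        with open(old_name, "rb") as f:
            seen_via_old = f.read()

        # Removing the old name only drops one directory entry.
        os.remove(old_name)
        with open(new_name, "rb") as f:
            after_removal = f.read()
        links_left = os.stat(new_name).st_nlink

    return seen_via_old, after_removal, links_left


seen_via_old, after_removal, links_left = shared_write_demo()
print(seen_via_old, after_removal, links_left)
```

Two servers writing to the same underlying files without coordinating with each other is what makes starting the old cluster dangerous. Deleting the old data directory, by contrast, just reduces the link count to 1.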

