首页 > 解决方案 > How to bring back the replication system in postgresql if the master ip address has been changed in ubuntu?

问题描述

Postgresql database replication has two servers one for master and the other for a slave. Due to some reason the master IP address got changed which was being used at several places in the slave server. With the new IP address, after replacing the old ones with the latest one in the slave server the replication is not working as before. Can someone help to resolve this issue? Following are the steps used in setting up the slave server :

1.add the master IP address in the pg_hba.conf file for the user replication

nano /etc/postgresql/11/main/pg_hba.conf host
    replication  master-IP/24  md5

2.modify the following lines in the PostgreSQL.conf file of slave server where listen_addresses should be the IP of the slave server
    nano /etc/postgresql/11/main/postgresql.conf
    listen_addresses = 'localhost,slave-IP'
    wal_level = replica
    max_wal_senders = 10
    wal_keep_segments = 64

3. Take the backup of the master server by entering the IP


pg_basebackup -h master-ip -D /var/lib/postgresql/11/main/ -P -U
    replication --wal-method=fetch

4.create a recovery file and adding the following commands


 standby_mode          = 'on'
    primary_conninfo      = 'host=master-ip port=5432 user=replication password= '
    trigger_file = '/tmp/MasterNow'

Below is the error from the log file:

started streaming WAL from primary at A/B3000000 on timeline 2
FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000020000000A000000B3 has already been removed

FATAL:  could not connect to the primary server: could not connect to server: Connection timed out
        Is the server running on host "master ip" and accepting
        TCP/IP connections on port 5432?

record with incorrect prev-link 33018C00/0 at 0/D8E15D18

标签: postgresqldatabase-replicationmaster-slave

解决方案


备用服务器停机的时间足够长,以至于主服务器不再具有所需的事务日志信息。

补救措施有以下三种:

  1. 在备用服务器的恢复配置中设置restore_command参数以从存档中恢复 WAL 段(这应该与archive_command主服务器上的相反)。然后重新启动备用。

    这是唯一允许您在不从头开始重建备用服务器的情况下进行恢复的选项。

  2. 在主服务器上设置wal_keep_segments足够高,以保留足够的 WAL 来覆盖中断。

    这不会帮助您现在恢复,但可以避免将来出现问题。

  3. 在主服务器上定义一个物理复制槽,并将其名称放在primary_slot_name备用服务器恢复配置的参数中。

    这不会帮助您现在恢复,但可以避免将来出现问题。

    注意:使用复制槽时,请监控复制。否则,一个关闭的备用将导致 WAL 段在主上堆积,最终填满磁盘。

除了第一个选项之外,所有选项都要求您使用 重建备用pg_basebackup数据库,因为所需的 WAL 信息不再可用。


推荐阅读