HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Postgres requested WAL segment has already been removed (however it is actually in the slave's directory)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
directorythepostgresremovedhoweverslavebeensegmenthasalready

Problem

I am using repmgr as my replication tool. On the slave I keep getting an error:


requested WAL segment has already been removed

When I check the Master indeed it is not there; however, it is in the slave's directories both in pg_xlogs and pg_xlogs/archive_status. I can't understand why it would be looking for this file if it's already in the slave?

In fact it has xlogs going past the requested one. The solutions in What to do with WAL files for Postgres Slave reset are for a slightly different problem. They seem to be for a scenario where the master deletes a log file before the slave receives it. In my case it is very much present on the slave and several other files in the sequence after the one being requested.

This also tells me I do not need to increase the keep wal segments option as it didn't seem to fall behind?

Solution

From Streaming Replication in the PostgreSQL documentation:


If you use streaming replication without file-based continuous
archiving, the server might recycle old WAL segments before the
standby has received them. If this occurs, the standby will need to be
reinitialized from a new base backup. You can avoid this by setting
wal_keep_segments to a value large enough to ensure that WAL segments
are not recycled too early, or by configuring a replication slot for
the standby. If you set up a WAL archive that's accessible from the
standby, these solutions are not required, since the standby can
always use the archive to catch up provided it retains enough
segments.

To fix the issue, you have to reinitialize the data from primary server. Remove data directory on slave:

root@replica:~# su postgres
postgres@replica:~# mv /var/lib/postgresql/12/main /var/lib/postgresql/12/main_old


Copy all data from the primary server:

sudo -u postgres pg_basebackup -h [PRIMARY_IP] -D /var/lib/postgresql/12/main -U replication -P -v


if version is 12, Create the standby.signal file, otherwise configure replica.conf:

touch /var/lib/postgresql/12/main/standby.signal


Slave configuration:

listen_addresses = 'localhost,[IP_ADDRESS_OF_REPLIACA_ON_LAN]'              # what IP address(es) to listen on; 
max_connections = 100 # Ensure that this value is the same as the primary's
wal_level = 'replica'
archive_mode = on
archive_command = 'cd .'
primary_conninfo = 'host=[PRIMARY_IP] port=5432 user=replication password=[REPLICATION PASSWORD]'
hot_standby = on

max_wal_senders = 48


How long is the pg_basebackup taking? Remember that segments are generated about every 5 minutes, so if the backup takes an hour, you need at least 12 segments stored. At 2 hours, you need 24 etc., I'd set the value to about 12.2 segments/hour of backup.

https://www.gab.lc/articles/postgresql-12-replication/

Code Snippets

root@replica:~# su postgres
postgres@replica:~# mv /var/lib/postgresql/12/main /var/lib/postgresql/12/main_old
sudo -u postgres pg_basebackup -h [PRIMARY_IP] -D /var/lib/postgresql/12/main -U replication -P -v
touch /var/lib/postgresql/12/main/standby.signal
listen_addresses = 'localhost,[IP_ADDRESS_OF_REPLIACA_ON_LAN]'              # what IP address(es) to listen on; 
max_connections = 100 # Ensure that this value is the same as the primary's
wal_level = 'replica'
archive_mode = on
archive_command = 'cd .'
primary_conninfo = 'host=[PRIMARY_IP] port=5432 user=replication password=[REPLICATION PASSWORD]'
hot_standby = on

max_wal_senders = 48

Context

StackExchange Database Administrators Q#158776, answer score: 6

Revisions (0)

No revisions yet.