snippetsqlMinor
How to determine whether a PostgreSQL Hot Standby is fully mirrored?
Viewed 0 times
postgresqlmirroredhowstandbyfullydeterminehotwhether
Problem
I have set up a hot standby of a PostgreSQL server. It all seems to be working, but I just want to be sure that I'm not missing something. In
I am concerned about the missing WAL files. Can anyone confirm that, as long as it reaches a consistent state, the hot standby contains all the data of the master?
Everything else I check indicates that it's ok; for example, running
/var/lib/pgsql/9.2/data/pg_log/postgresql-Wed.log I have the following:LOG: creating missing WAL directory "pg_xlog/archive_status"
cp: cannot stat `/var/lib/pgsql/9.2/wal/00000002.history': No such file or directory
LOG: entering standby mode
cp: cannot stat `/var/lib/pgsql/9.2/wal/0000000200000031000000B4': No such file or directory
LOG: streaming replication successfully connected to primary
LOG: redo starts at 31/B47BFAC0
LOG: consistent recovery state reached at 31/B73624A0
LOG: database system is ready to accept read only connectionsI am concerned about the missing WAL files. Can anyone confirm that, as long as it reaches a consistent state, the hot standby contains all the data of the master?
Everything else I check indicates that it's ok; for example, running
psql -x -c "select * from pg_stat_replication;" on the master looks good, and adding a new record on the master replicates. I just want to be sure that there won't be anything missing from the slave.Solution
I think this is normal and expected if your
The manual says that:
At startup, the standby begins by restoring all WAL available in the
archive location, calling
WAL available there and restore_command fails, it tries to restore any
WAL available in the pg_xlog directory. If that fails, and streaming
replication has been configured, the standby tries to connect to the
primary server and start streaming WAL from the last valid record
found in archive or pg_xlog. If that fails or streaming replication is
not configured, or if the connection is later disconnected, the
standby goes back to step 1 and tries to restore the file from the
archive again. This loop of retries from the archive, pg_xlog, and via
streaming replication goes on until the server is stopped or failover
is triggered by a trigger file.
So you can expect to see exactly one
Then it will connect to the primary and start streaming as described above, and as you saw in your logs:
The slave is not guaranteed to be exactly up-to-date with the master, because it could be disconnected from the master for example. In particular, this line:
does not mean that "the hot standby contains all the data of the master". However, if you see it followed by this line, as you did:
then the database is "ready enough" to start functioning as a read-only standby, as the manual says:
It may take some time for Hot Standby connections to be allowed,
because the server will not accept connections until it has completed
sufficient recovery to provide a consistent state against which
queries can run. During this period, clients that attempt to connect
will be refused with an error message.
In my case, I saw
restore_command is set to something like this example:restore_command = 'cp /mnt/server/archivedir/%f "%p"'The manual says that:
At startup, the standby begins by restoring all WAL available in the
archive location, calling
restore_command. Once it reaches the end ofWAL available there and restore_command fails, it tries to restore any
WAL available in the pg_xlog directory. If that fails, and streaming
replication has been configured, the standby tries to connect to the
primary server and start streaming WAL from the last valid record
found in archive or pg_xlog. If that fails or streaming replication is
not configured, or if the connection is later disconnected, the
standby goes back to step 1 and tries to restore the file from the
archive again. This loop of retries from the archive, pg_xlog, and via
streaming replication goes on until the server is stopped or failover
is triggered by a trigger file.
So you can expect to see exactly one
restore_command failure when you start your standby, because PostgreSQL will keep calling it (with incrementing log file names/numbers) until it fails once.Then it will connect to the primary and start streaming as described above, and as you saw in your logs:
LOG: streaming replication successfully connected to primaryThe slave is not guaranteed to be exactly up-to-date with the master, because it could be disconnected from the master for example. In particular, this line:
LOG: consistent recovery state reached at 31/B73624A0does not mean that "the hot standby contains all the data of the master". However, if you see it followed by this line, as you did:
LOG: database system is ready to accept read only connectionsthen the database is "ready enough" to start functioning as a read-only standby, as the manual says:
It may take some time for Hot Standby connections to be allowed,
because the server will not accept connections until it has completed
sufficient recovery to provide a consistent state against which
queries can run. During this period, clients that attempt to connect
will be refused with an error message.
In my case, I saw
consistent recovery state reached not followed by database system is ready to accept read only connections. This turned out to be a problem with an embedded scripting language plugin (plpython2) having a system-wide startup script (sitecustomize.py) which did bad things to the PostgreSQL process (enabling faulthandler and installing a signal handler for SIGUSR2) which caused it to never enter hot standby mode.Code Snippets
restore_command = 'cp /mnt/server/archivedir/%f "%p"'LOG: streaming replication successfully connected to primaryLOG: consistent recovery state reached at 31/B73624A0LOG: database system is ready to accept read only connectionsContext
StackExchange Database Administrators Q#97635, answer score: 2
Revisions (0)
No revisions yet.