HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

pg_switch_wal In Point In Time Recovery PostgreSQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlpointtimerecoverypg_switch_wal

Problem

I was following this tutorial about point in time recovery in postgres and come up with the following questions. Can anybody help me.

  • Is it necessary to use pg_basebackup every time I want to back up the database or the postgres backs up the database and stores somewhere considering its configurations?



  • What does the psql -c "select pg_switch_wal();"do? and is it necessary to run every time I want to back up the database?



I use postgres 11 on ubuntu 18.04.3 if that helps.

EDIT

Let me be more specific in my question. As I've understood by searching on the internet, we should have two files to back up the database: one base which contains the general files such as table attributes and a WAL file which contain the log files showing instances in the database. And I have seen that pg_basebackup backs up whole database cluster not only WAL files, so there should be a function which only backs up WAL files.

Thanks

Solution

In theory, you can run "pg_basebackup" just once (or once per major version upgrade) but then it will take longer and longer to do a restore, as it has to run through all the accumulated WAL since that backup. So it usually makes sense to do a new pg_basebackup periodically.

"The database" won't do it for you. There are a variety of backup tools out there that probably will (pg_barman, pgbackrest, pg_probackup, etc.). But you didn't mention using any of them.

pg_switch_wal() is not necessary as pg_basebackup does it for you.

pg_basebackup will archive the wal files if you set it to, but will only do so for WAL files which are generated during the duration of the base backup itself. This is suitable for a snapshot backup, but not for PITR. In other words, it will let you restore to the exact moment the backup command finished, but is not enough to let you restore to some point after that time. To backup WAL files at other times, you need to set the "archive_command" setting to something that will copy the files for you. (Also, if you have a valid "archive_command" in place before you start the basebackup, then the basebackup itself doesn't really need to backup the WAL in the first place. Then the base backup will not be self contained, you will have to pair it with the WAL archive in order to restore.)

The tutorial you refer to shows a setting for "archive_command". However, the setting it shows is not bullet-proof. As the docs say, "this is an example, not a recommendation". It has two main problems. Once is that it does not fsync the copy before declaring success. That means the system can remove/recycle the WAL file from pg_wal, before the copy of it has been flushed to the disk drive; and so a crash at the wrong time can leave you with zero valid copies. The other problem is that it seems to be copying the file locally. That means that if your computer is destroyed, you will lose your backup at the same time as you lose your master copy. (It is possible that it is not copying locally, because /var/lib/postgresql/pg_log_archive might be a network mounted file-system; but if so then that exacerbates the fsync problem, how do you know the copy on the remote file system is truly safe?)

If you don't want to use "archive_command", then another tool is "pg_receivewal". You would run this on another server, and it will connect to your Postgresql instance and "pull" the wal files, rather than the "push" that "archive_command" does.

Yet another option is to set up a streaming standby on another server, set "archive_mode" to "always", and then have "archive_command" run on the standby, rather than on the master, to save your WAL files.

Context

StackExchange Database Administrators Q#250782, answer score: 4

Revisions (0)

No revisions yet.