patternsqlModerate
Make Postgres database temporarily read-only (for performing volume snapshots)
Viewed 0 times
performingpostgresreadmakesnapshotsdatabaseforonlyvolumetemporarily
Problem
The PostgreSQL built-in backup mechanism isn't always very suitable. Sometimes, you want to put the application in a quiescent state, because it has external data with which you want to backup at the same time you back up the PG data. But the only way to put the application in a quiescent state is to "lock" the database also. PG lacks a database-wide or cluster-wide locking mechanism. Putting PG into a read-only state would be a piece in the following solution:
- Quiesce application data (disable logins)
- Quiesce database (by making it read-only)
- Perform a PG checkpoint or pg_xlog_switch()
- Create a snapshot of the App and Data volumes
- Resume the database (make it RW again)
- Resume the application
- Backup the snapshots
Solution
After culling answers elsewhere on the internet, I devised a solution. The other answers were in and of themselves, incomplete. So I am presenting an answer here in hopes it will benefit others.
The Strategy
Once that is done, you would (in my solution):
Pitfalls
-
Restoring the read-write state is not so simple. If read-only connections now exist, you must terminate them in order for the new read-write status to take effect. But new connections might arrive while killing existing ones. So again, you must
Alternate strategy
Another strategy is to change the permissions on the role used by the application. This can be quite messy and is less general.
For instance, you'd have to revoke/re-grant on not just tables, but sequences, large objects, and probably the schema itself. Further, what exactly is the behavior of existing connections when you change the access? Probably no impact, which means you also need to kill those backends. Finally, let's say the application has read-write access to most tables, but not to others in the schema. You'd have to make sure your re-granting doesn't include those objects as well.
Another possibility is to LOCK all the tables, by querying the catalog and performing a dynamic query. That seemed to perilous for my tastes.
Implementation
Pause_service
The database instance name is 'gitlabhq' and the username of the application is 'gitlab'. Replace it with your own:
Resume
There's a possibility that in this last step you will kill long-running read-only/SELECT queries, but in my experience, such long-running queries can last minutes if not hours, and it's acceptable to kill these in order to ensure uptime for everyone else.
The Strategy
- Disable connections to the database (not the cluster).
- Set the database's
default_transaction_read_onlysetting totrue.
- Terminate the existing connections to that database.
- Re-enable (read-only) connections.
Once that is done, you would (in my solution):
- Perform the
CHECKPOINT(I think this is the safest, but apg_xlog_switch()would be appropriate for very high-load servers)
- Take the volume snapshot
- Reverse the previous steps. (But this is tricky!)
Pitfalls
- Terminating connections while they are mid-transaction is probably a bad idea. Better to kill idle connections, wait for a few seconds, then kill idle ones, wait a few more, repeat until they are all gone.
- At some point, you'll have to kill open/hung queries or abort the backup.
- At the start of a transaction of session, Postgresql takes a kind of a snapshot of the process table. You have to reset this snapshot every time you go to check if unwanted processes are still there. See
pg_stat_clear_snapshot()
-
Restoring the read-write state is not so simple. If read-only connections now exist, you must terminate them in order for the new read-write status to take effect. But new connections might arrive while killing existing ones. So again, you must
- Disable connections to the database
- change default_transaction_read_only status to
false
- kill the existing connections
- Re-enable (r/w) connections to the database
Alternate strategy
Another strategy is to change the permissions on the role used by the application. This can be quite messy and is less general.
For instance, you'd have to revoke/re-grant on not just tables, but sequences, large objects, and probably the schema itself. Further, what exactly is the behavior of existing connections when you change the access? Probably no impact, which means you also need to kill those backends. Finally, let's say the application has read-write access to most tables, but not to others in the schema. You'd have to make sure your re-granting doesn't include those objects as well.
Another possibility is to LOCK all the tables, by querying the catalog and performing a dynamic query. That seemed to perilous for my tastes.
Implementation
Pause_service
The database instance name is 'gitlabhq' and the username of the application is 'gitlab'. Replace it with your own:
psql -Upostgres = 10 OR state in ('idle', 'disabled' ));
PERFORM pg_stat_clear_snapshot();
EXIT WHEN NOT EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' );
RAISE NOTICE 'pg backends still open: sleeping 2 seconds';
PERFORM pg_sleep(2);
PERFORM pg_stat_clear_snapshot();
END LOOP;
-- send notice if still open connections
IF EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' ) THEN
RAISE NOTICE 'Hung backends. Backup might not be 100%% consistent';
END IF;
END;$X$;
-- 4. Allow read-only connections while checkpointing/snapshotting
alter database gitlabhq with allow_connections = on;
CHECKPOINT;Resume
alter database gitlabhq_production with allow_connections = off;
alter database gitlabhq set default_transaction_read_only = false;
SELECT pg_stat_clear_snapshot();
SELECT pg_terminate_backend(pid) from pg_stat_activity where usename = 'gitlab';
alter database gitlabhq with allow_connections = on;There's a possibility that in this last step you will kill long-running read-only/SELECT queries, but in my experience, such long-running queries can last minutes if not hours, and it's acceptable to kill these in order to ensure uptime for everyone else.
Code Snippets
psql -Upostgres <<'PAUSE_DB'
-- 1. disable new connections
alter database gitlabhq_production with allow_connections = off;
-- 2. Make DB read-only
alter database gitlabhq set default_transaction_read_only = true;
-- 3. Inobtrusively but safely terminate current connections
DO $X$ BEGIN
-- kill open idle connections, try up to 9x. Last time, kill regardless
FOR i IN 1..10 LOOP
PERFORM pg_terminate_backend(pid) from pg_stat_activity where usename = 'gitlab'
and (i >= 10 OR state in ('idle', 'disabled' ));
PERFORM pg_stat_clear_snapshot();
EXIT WHEN NOT EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' );
RAISE NOTICE 'pg backends still open: sleeping 2 seconds';
PERFORM pg_sleep(2);
PERFORM pg_stat_clear_snapshot();
END LOOP;
-- send notice if still open connections
IF EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' ) THEN
RAISE NOTICE 'Hung backends. Backup might not be 100%% consistent';
END IF;
END;$X$;
-- 4. Allow read-only connections while checkpointing/snapshotting
alter database gitlabhq with allow_connections = on;
CHECKPOINT;alter database gitlabhq_production with allow_connections = off;
alter database gitlabhq set default_transaction_read_only = false;
SELECT pg_stat_clear_snapshot();
SELECT pg_terminate_backend(pid) from pg_stat_activity where usename = 'gitlab';
alter database gitlabhq with allow_connections = on;Context
StackExchange Database Administrators Q#183982, answer score: 19
Revisions (0)
No revisions yet.