How to `vacuumlo` an RDS PostgreSQL database?
Tags: postgresql, database, how, rds, vacuumlo
Problem
As you may have noticed, as a fully managed database-as-a-service product, AWS Relational Database Service (RDS) restricts the execution of userland commands.
I was staging a production database, and got my pg_largeobject table bloated to 40% of the whole persistent storage capacity. How can I run vacuumlo (beautifully well explained in another DBA.SE question here) on an RDS instance running PostgreSQL?

Solution
Find here (in the GitHub mirror of the official PostgreSQL repository) the sources of vacuumlo for the PostgreSQL version I was using at that very moment. You can imagine the rest: I just mimicked what the program performs, which is also simply described here.
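(A side note of mine, not from the original answer: vacuumlo itself is a client-side libpq tool shipped with the PostgreSQL client/contrib packages, so depending on your client installation, network access, and large-object ownership, it may be possible to point it at the RDS endpoint directly. The endpoint, user, and database name below are hypothetical.)

```shell
# Hypothetical invocation: vacuumlo runs on the client, not on the server.
# -n = dry run (report only), -v = verbose, -l 100 = limit removals per transaction.
vacuumlo -n -v -l 100 -h mydb.example.rds.amazonaws.com -p 5432 -U master mydbname
```

Note the "must be owner of large object" errors further below in 2.A: a plain vacuumlo run would face the same ownership restrictions.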
1. Temporary table construction.
Prepare the temporary table of object references or OIDs.
1.1. Temporary lob table as copy of full lob table.
=> SET search_path = pg_catalog;
[...]
=> CREATE TABLE vacuum_lo_removeme AS
     SELECT oid AS lo FROM pg_largeobject_metadata;
[...]
=> ANALYZE vacuum_lo_removeme;
[...]
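As a quick sanity check (my addition, not in the original), the copy should match the metadata table row-for-row before any trimming begins:

```sql
-- Row counts should be identical right after the CREATE TABLE ... AS.
SELECT (SELECT count(*) FROM vacuum_lo_removeme) =
       (SELECT count(*) FROM pg_largeobject_metadata) AS copies_match;
```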
1.2. Limit the temporary lob table.
Perform the query that returns all your database columns of type OID:
=> SELECT
s.nspname, c.relname, a.attname FROM pg_class c, pg_attribute a
, pg_namespace s, pg_type t
WHERE
a.attnum > 0 AND NOT a.attisdropped AND a.attrelid = c.oid
AND a.atttypid = t.oid AND c.relnamespace = s.oid
AND t.typname in ('oid', 'lo') AND c.relkind in ('r','m')
AND s.nspname !~ '^pg_';

Next you have to execute the following DELETE for every result fetched by the previous query (note that the ${VARIABLE}-style placeholders are something you should substitute yourself according to your mileage), in order to remove from the temporary table all OIDs of objects actually in use:

=> DELETE FROM vacuum_lo_removeme WHERE
     lo IN (SELECT ${column} FROM ${SCHEMA}.${TABLE});

In my case it was only two tables totalling five columns, and both tables were actually empty, so naturally the five DELETE queries did not do a thing. If you have a bigger OID-enabled subschema you might need to automate this somehow.

2. Large objects unlinking.
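Stepping back to step 1.2 for a moment: if you do have many OID columns, psql can generate and run the per-column DELETEs for you. This is my own sketch (not from the original answer), using psql's \gexec meta-command, which requires psql 9.6 or later:

```sql
-- Build one DELETE per (schema, table, column) found by the 1.2 query,
-- then \gexec executes each generated statement.
SELECT format(
         'DELETE FROM vacuum_lo_removeme WHERE lo IN (SELECT %I FROM %I.%I);',
         a.attname, s.nspname, c.relname)
FROM pg_class c, pg_attribute a, pg_namespace s, pg_type t
WHERE a.attnum > 0 AND NOT a.attisdropped AND a.attrelid = c.oid
  AND a.atttypid = t.oid AND c.relnamespace = s.oid
  AND t.typname IN ('oid', 'lo') AND c.relkind IN ('r', 'm')
  AND s.nspname !~ '^pg_'
\gexec
```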
Finally, the program declares a cursor that iterates over the remaining lo cells of the temporary table, purging them with lo_unlink() calls.

2.A. Do not do it this way.
I should have automated that with a PL/pgSQL stored procedure, but since I am bad at that kind of task, I just issued this one-liner:
$ echo 'SELECT COUNT(*) FROM vacuum_lo_removeme;' | \
$MY_AUTOMATABLE_PSQL_MILEAGE
count
---------
1117233
(1 row)

Then this other one, iterating: select the first orphan OID from the temporary table of orphan OIDs, unlink it, and remove it from the table:
$ for i in {1..1117000}; do \
    export oid=$(echo 'SELECT * FROM vacuum_lo_removeme LIMIT 1' | \
      $MY_AUTOMATABLE_PSQL_MILEAGE | grep -v 'lo\|\-\-\-\-\|row\|^$' | \
      sed s/\ //g) && \
    echo "SELECT lo_unlink($oid); \
      DELETE FROM vacuum_lo_removeme WHERE lo = $oid" | \
      $MY_AUTOMATABLE_PSQL_MILEAGE; \
  done
 lo_unlink
-----------
         1
(1 row)
DELETE 1
 lo_unlink
-----------
         1
(1 row)
DELETE 1
 lo_unlink
-----------
         1
(1 row)
DELETE 1
 lo_unlink
-----------
         1
(1 row)
DELETE 1
ERROR: must be owner of large object 18448
DELETE 1
ERROR: must be owner of large object 18449
DELETE 1
ERROR: must be owner of large object 18450
DELETE 1
ERROR: must be owner of large object 18451
[...]
 lo_unlink
-----------
         1
(1 row)
DELETE 1
[...]

I knew it was suboptimized as hell, but I let it slowly remove those orphan records. When you are not a DBA at all, such one-liners can be easier to forge than writing meaningful, idiomatic PL/pgSQL.

But this was too slow to leave it like this.

2.B. Do this better than 2.A (though not a silver bullet yet).

You will be able to speed up large object unlinking with something simple such as:

=> CREATE OR REPLACE FUNCTION unlink_orphan_los() RETURNS VOID AS $$
DECLARE
    iterator integer := 0;
    largeoid OID;
    myportal CURSOR FOR SELECT lo FROM vacuum_lo_removeme;
BEGIN
    OPEN myportal;
    LOOP
        FETCH myportal INTO largeoid;
        EXIT WHEN NOT FOUND;
        PERFORM lo_unlink(largeoid);
        DELETE FROM vacuum_lo_removeme WHERE lo = largeoid;
        iterator := iterator + 1;
        RAISE NOTICE '(%) removed lo %', iterator, largeoid;
        IF iterator = 100 THEN EXIT; END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

NOTE: it is not required to unlink large objects 100 at a time, nor any particular number of them at a time, but unlinking 100 at a time is a safe baseline for the default memory configuration of all AWS instance sizes. If you use an excessively large number here, you risk the function failing because of insufficient memory; what you can get away with depends on the number of objects to unlink and their size, on the instance type and size, and on how much further manual configuration you have applied.

The function is somewhat easy to forge for a non-DBA person; then call it with something like:

$ for i in $(seq 0 $WHATEVER); do echo 'SELECT unlink_orphan_los()' | \
    $YOUR_AUTOMATABLE_PSQL_MILEAGE; \
  done

Where ${WHATEVER} is a constant you construct depending on the size of the temporary large objects table and the number of locks per transaction your configuration allows (I am using RDS defaults, but iterating from bash, I guess).
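For sizing that loop, a small helper (my own sketch, not from the original answer): with the 1117233 orphans counted in 2.A and the batch of 100 unlinks hard-coded into unlink_orphan_los(), ${WHATEVER} is just a ceiling division:

```shell
# Ceiling division in pure bash: iterations needed to drain TOTAL
# orphan large objects at BATCH unlinks per function call.
TOTAL=1117233   # from: SELECT COUNT(*) FROM vacuum_lo_removeme;
BATCH=100       # batch size hard-coded in unlink_orphan_los()
WHATEVER=$(( (TOTAL + BATCH - 1) / BATCH ))
echo "$WHATEVER"   # 11173 iterations
```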
Context
StackExchange Database Administrators Q#174663, answer score: 5