principlesqlMinor
Is VACUUM FULL necessary on a DROP TABLE / TRUNCATE approach for a datawarehouse
Viewed 0 times
necessaryfullvacuumdroptruncatefordatawarehouseapproachtable
Problem
We are updating a datawarehouse on a weekly basis using a series of
... for data import
and:
for table updates.
We end our update process with a
Here, as it is a share of 100% for all tables, we reasonably thought that
As we see the ouptput of the verbose option, it seems that Postgresql has not much to do, as every table
I guess that on the contrary
Most tables are 10-100m rows big.
But we were wondering if
(Or may be the whole update process (DROP / CREATE TABLE AS) is not the right way to do?)
TRUNCATE source_table1
COPY source_table1 FROM [...]... for data import
and:
DROP TABLE IF EXISTS my_table
CREATE TABLE my_table AS SELECT [...]for table updates.
We end our update process with a
VACUUM FULL [VERBOSE] ANALYZE, because, as the documentation suggests, VACUUM should be done when a significant share of tuples has been updated or removed.Here, as it is a share of 100% for all tables, we reasonably thought that
VACUUM should be applied.As we see the ouptput of the verbose option, it seems that Postgresql has not much to do, as every table
VACUUMed gives: INFO: vacuuming "public.table345"
INFO: "table345": found 0 removable, 9831703 nonremovable row versions in 62538 pages
DETAIL : 0 dead row versions cannot be removed yet.I guess that on the contrary
ANALYZE is more than helpfull to update the internal statistics.Most tables are 10-100m rows big.
But we were wondering if
VACUUM FULL or just VACUUM were really necessary in that case?(Or may be the whole update process (DROP / CREATE TABLE AS) is not the right way to do?)
Solution
Point about TRUNCATE/COPY
Unrelated note, depending on the WAL level, wrapping the
In minimal level, WAL-logging of some bulk operations can be safely skipped, which can make those operations much faster (see Section 14.4.7). Operations in which this optimization can be applied include
source
Issue of
-
There is no need to do a
-
A simple
Unrelated note, depending on the WAL level, wrapping the
TRUNCATE and COPY in the same transaction things may go faster, because WAL will be skipped. Further the CTAS will always skip most WAL.In minimal level, WAL-logging of some bulk operations can be safely skipped, which can make those operations much faster (see Section 14.4.7). Operations in which this optimization can be applied include
source
- CREATE TABLE AS
- CREATE INDEX
- CLUSTER
- COPY into tables that were created or truncated in the same transaction
BEGIN;
TRUNCATE source_table1
COPY source_table1 FROM [...]
COMMIT;Issue of
VACUUM FULL-
There is no need to do a
VACUUM FULL on a new table. The table is already new in those transactions so there is no need to rewrite it, as FULL does. There is also no need to VACUUM it because there are no movable rows that VACUUM will act on. When you run VACUUM FULL VERBOSE, you can see nothing was removable, and no dead rows were removed. VACUUM FULL can be useful if- you're staging
INSERTSthough a temp table, and adding them in batches into a dirty table. Then it can often pay off at the end of the process.
- you're running
UPDATEon a large batch becauseUPDATEgenerates new rows forcingVACUUMto later mark the old rows as dead.
-
A simple
ANALYZE will work fine. This will update the statistics on the table.Code Snippets
BEGIN;
TRUNCATE source_table1
COPY source_table1 FROM [...]
COMMIT;Context
StackExchange Database Administrators Q#162240, answer score: 3
Revisions (0)
No revisions yet.