patternsqlMinor
PostgreSQL minimize data dir for Docker image
Viewed 0 times
postgresqlminimizeimagedockerdirfordata
Problem
I am preparing a Postgres Docker image based on the official image. I have an application that fills the database with initial data (this takes about an hour for ~300MB of data).
I want to make an image out of that Postgres database so that the data is already initialized.
This works fine, but due to the nature of the WAL, the PG_DATA directory is larger than necessary. For Docker image distribution, minifying PG_DATA is important.
This will add up if I later think about adding a couple more layers of data, which all add their own set of WAL files that are not strictly necessary.
What options do I have to minimize PG_DATA?
My main question is about minimizing the WAL, at the end I talk about minimizing other areas of PG_DATA.
Delete the WAL
When is it safe for me to delete the WAL?
To be clear, this is a throw-away container filled with some initial data. It’s not intended to be a productive system.
I am trying to keep the WAL small by the following config:
However, even after issuing
Why is Postgres not deleting the files? Based on the settings, I would assume that only 32MB should be kept.
After
From what I can tell from other questions and answers, Postgres should automatically delete the WAL files.
Some Q&As I am already aware, but do not talk about my specific problem and use-case:
I want to make an image out of that Postgres database so that the data is already initialized.
This works fine, but due to the nature of the WAL, the PG_DATA directory is larger than necessary. For Docker image distribution, minifying PG_DATA is important.
This will add up if I later think about adding a couple more layers of data, which all add their own set of WAL files that are not strictly necessary.
What options do I have to minimize PG_DATA?
My main question is about minimizing the WAL, at the end I talk about minimizing other areas of PG_DATA.
Delete the WAL
When is it safe for me to delete the WAL?
To be clear, this is a throw-away container filled with some initial data. It’s not intended to be a productive system.
archive_mode = off, wal_level = minimal, no replication.I am trying to keep the WAL small by the following config:
wal_keep_size = 0
wal_recycle = off
min_wal_size = 32MB
max_wal_senders = 0 # necessary due to wal_level = minimalHowever, even after issuing
CHECKPOINT;, the pg_wal directory is still ~200MB big.Why is Postgres not deleting the files? Based on the settings, I would assume that only 32MB should be kept.
After
CHECKPOINT; and stopping the server, is it safe to delete all files from pg_wal?From what I can tell from other questions and answers, Postgres should automatically delete the WAL files.
Some Q&As I am already aware, but do not talk about my specific problem and use-case:
- How can I solve postgresql problem after deleting wal files?
- https://stackoverflow.com/questions/49650016/how-to-reduce-wal-file-count-in-edb-postgresql-9-6-instance
- https://stackoverflow.com/questions/35144403/which-postgresql-wal-files-can-i-safely-remove-from-the-wal-archive-folder
- https://stackoverflow.com/questions/49539938/postgres-wal-
Solution
After you have populated your database, shut it down cleanly (important!) and run
That will truncate your WAL. Never manually delete WAL files.
pg_resetwal -D /path/to/data/directory
That will truncate your WAL. Never manually delete WAL files.
Context
StackExchange Database Administrators Q#307161, answer score: 4
Revisions (0)
No revisions yet.