HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL minimize data dir for Docker image

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlminimizeimagedockerdirfordata

Problem

I am preparing a Postgres Docker image based on the official image. I have an application that fills the database with initial data (this takes about an hour for ~300MB of data).
I want to make an image out of that Postgres database so that the data is already initialized.
This works fine, but due to the nature of the WAL, the PG_DATA directory is larger than necessary. For Docker image distribution, minifying PG_DATA is important.
This will add up if I later think about adding a couple more layers of data, which all add their own set of WAL files that are not strictly necessary.

What options do I have to minimize PG_DATA?
My main question is about minimizing the WAL, at the end I talk about minimizing other areas of PG_DATA.
Delete the WAL

When is it safe for me to delete the WAL?

To be clear, this is a throw-away container filled with some initial data. It’s not intended to be a productive system. archive_mode = off, wal_level = minimal, no replication.

I am trying to keep the WAL small by the following config:

wal_keep_size = 0
wal_recycle = off
min_wal_size = 32MB
max_wal_senders = 0 # necessary due to wal_level = minimal


However, even after issuing CHECKPOINT;, the pg_wal directory is still ~200MB big.
Why is Postgres not deleting the files? Based on the settings, I would assume that only 32MB should be kept.
After CHECKPOINT; and stopping the server, is it safe to delete all files from pg_wal?

From what I can tell from other questions and answers, Postgres should automatically delete the WAL files.
Some Q&As I am already aware, but do not talk about my specific problem and use-case:

  • How can I solve postgresql problem after deleting wal files?



  • https://stackoverflow.com/questions/49650016/how-to-reduce-wal-file-count-in-edb-postgresql-9-6-instance



  • https://stackoverflow.com/questions/35144403/which-postgresql-wal-files-can-i-safely-remove-from-the-wal-archive-folder



  • https://stackoverflow.com/questions/49539938/postgres-wal-

Solution

After you have populated your database, shut it down cleanly (important!) and run
pg_resetwal -D /path/to/data/directory


That will truncate your WAL. Never manually delete WAL files.

Context

StackExchange Database Administrators Q#307161, answer score: 4

Revisions (0)

No revisions yet.