patternsqlModerate
Compress PostgreSQL Database
Viewed 0 times
postgresqldatabasecompress
Problem
I have a large PostgreSQL database over 500GB in size which is too large. Is there anyway to compress the database down to a more manageable size? I have attempted to do this with SquashFS and the database compressed down to 177GB however PostgreSQL requires that the database have write access and Squashed systems are Read-Only. Do more experienced database users have any suggestions to accomplish this goal?
The database holds GIS data for the planet and will be used locally on a deployed system. Currently it sits on a 1TB SSD, however, I am trying to avoid slapping in an additional hard drive simply to accommodate a large database. The database performs as desired with no issue, I would simply like to compress it down to a more manageable size and avoid placing it on a separate drive.
The database holds GIS data for the planet and will be used locally on a deployed system. Currently it sits on a 1TB SSD, however, I am trying to avoid slapping in an additional hard drive simply to accommodate a large database. The database performs as desired with no issue, I would simply like to compress it down to a more manageable size and avoid placing it on a separate drive.
Solution
File system
A very popular method of doing this is with the file system. BTRFS and ZFS works under the database at the file system level. Both can be used in a loopback device so you can provide a compressed tablespace without having another partition. There are caveats with this, if the tablespace fails it may take your cluster too.
ZFS
ZFS is the big one here. It's what I would go for.
See also
Btrfs
Btrfs is a strong contender but it's been in active development for a very long time, and the lack of major distros picking it up as a default has many people questioning whether or not it's ready for "prime time."
PostgreSQL
GIS Methods (PostGIS)
-
For Polygons, one method is to simplify polygons by dropping vertices with
-
For Points, one method is spatial clustering.
Both of these result in a loss of information. PostGIS, like most of the features of the database, doesn't have a transparent "magic compression" option.
There is also
A very popular method of doing this is with the file system. BTRFS and ZFS works under the database at the file system level. Both can be used in a loopback device so you can provide a compressed tablespace without having another partition. There are caveats with this, if the tablespace fails it may take your cluster too.
ZFS
ZFS is the big one here. It's what I would go for.
See also
- https://blog.2ndquadrant.com/pg-phriday-postgres-zfs/
- https://blogs.oracle.com/observatory/zfs-compression-a-win-win (popular on Oracle too)
Btrfs
Btrfs is a strong contender but it's been in active development for a very long time, and the lack of major distros picking it up as a default has many people questioning whether or not it's ready for "prime time."
- https://btrfs.wiki.kernel.org/index.php/Compression
PostgreSQL
GIS Methods (PostGIS)
-
For Polygons, one method is to simplify polygons by dropping vertices with
ST_Simplify.-
For Points, one method is spatial clustering.
Both of these result in a loss of information. PostGIS, like most of the features of the database, doesn't have a transparent "magic compression" option.
cstore_fdwThere is also
cstore_fdw which is a columnar store that offers compression. It has a different performance profile so ymmv.Context
StackExchange Database Administrators Q#197121, answer score: 12
Revisions (0)
No revisions yet.