patternsqlMinor
Very large btree index with few rows (openstreetmap gis data)
Viewed 0 times
rowswithgisbtreelargefewopenstreetmapveryindexdata
Problem
I tried to import a subset of open street map data to a Postgres database using the standard tool (osm2pgsql). I then tried to rebuild the indices on the table using plain sql (dropping them, then recreating them) and they end up taking massive amount of space.
The table size is 9GB, with an int
I end up with an index that's 1GB in size for 140k rows, which is massive considering how little data there is. This doesn't seem to happen on other systems, and I tried to
I don't know much about Postgres, but any hints about how to reduce the index size would be very appreciated.
(OS is ubuntu 12-04, PostgreSQL is version 9.1)
The table size is 9GB, with an int
id, 70 text columns for attributes (for example column access_type has 250 different text values, and is null in 90% of cases), and a possibly large geometry column. If I try:CREATE INDEX planet_osm_polygon_accesstype_index
ON planet_osm_polygon
USING BTREE (access_type)I end up with an index that's 1GB in size for 140k rows, which is massive considering how little data there is. This doesn't seem to happen on other systems, and I tried to
ANALYZE, then VACUUM.I don't know much about Postgres, but any hints about how to reduce the index size would be very appreciated.
(OS is ubuntu 12-04, PostgreSQL is version 9.1)
Solution
You seem to expect that rows with
that's hardly useful in your case. Such common values hardly ever make sense in an index to begin with, be it
Should be much smaller and thus faster.
Remember that you may have to include the same
Related answers with more details:
NULL values are excluded from a B-tree index automatically, but that's not the case. Those are indexed as well and can be searched for. However, since:access_type ... is null in 90% of casesthat's hardly useful in your case. Such common values hardly ever make sense in an index to begin with, be it
NULL or any other value. Exclude NULL values from the index with a partial index.CREATE INDEX planet_osm_polygon_accesstype_index ON planet_osm_polygon (access_type)
WHERE access_type IS NOT NULL;Should be much smaller and thus faster.
Remember that you may have to include the same
WHERE condition in queries to make Postgres realize it can apply the partial index.Related answers with more details:
- Unexpected Seq Scan when doing query against boolean with value NULL
- Join table on some conditions
Code Snippets
CREATE INDEX planet_osm_polygon_accesstype_index ON planet_osm_polygon (access_type)
WHERE access_type IS NOT NULL;Context
StackExchange Database Administrators Q#71909, answer score: 4
Revisions (0)
No revisions yet.