principlesqlMinor
Is there a memory overhead using partitions in postgres vs. a single table?
Viewed 0 times
postgrespartitionssingleusingmemoryoverheadtheretable
Problem
Background
I am ingesting time series data totalling to about 10M rows per day, where each row has a timestamp, a player id and some other columns.
That data is then queried via an API for a specific player and a time range (e.g. for last 90 days for player
I am running postgres 9.6.17, machine has 500GB hdd space (with about 15% available space at all time), 8 cores and 16GB of ram.
The API is 20 python gunicorn workers running flask and sqlalchemy+psycopg2 on a separate machine. Each worker has a pool of 2 connections to the DB allowing for an overflow of 5. Pool settings used to be higher but it turned out there is almost no benefit to using pools, hence the low numbers.
Naive approach
Initially, I put all the data into a single table and indexing both on timestamp and player. This approach worked fine until the amount of data started to make it slower and slower (for obvious reasons). This lead the API workers to timeout and return 500. A cost of a typical query (single player data over 6 months) as returned by
Better approach with partitions
As an improvement I started storing data in partitions, one per day and then creating the player id and timestamp index on every partition instead.
I am ingesting time series data totalling to about 10M rows per day, where each row has a timestamp, a player id and some other columns.
That data is then queried via an API for a specific player and a time range (e.g. for last 90 days for player
xxx), most cases require more than 1 concurrent request (normally around ~10 concurrent requests are executed)I am running postgres 9.6.17, machine has 500GB hdd space (with about 15% available space at all time), 8 cores and 16GB of ram.
work_mem is set to 2GB, cache_size to 12GB, max connections is set to 100 etc.The API is 20 python gunicorn workers running flask and sqlalchemy+psycopg2 on a separate machine. Each worker has a pool of 2 connections to the DB allowing for an overflow of 5. Pool settings used to be higher but it turned out there is almost no benefit to using pools, hence the low numbers.
Naive approach
Initially, I put all the data into a single table and indexing both on timestamp and player. This approach worked fine until the amount of data started to make it slower and slower (for obvious reasons). This lead the API workers to timeout and return 500. A cost of a typical query (single player data over 6 months) as returned by
explain would be around 1M, example belowBitmap Heap Scan on player_data (cost=515553.98..585514.47 rows=80037 width=32)
Recheck Cond: (((player_id)::text = 'xxx'::text) AND (ts >= 1572566400) AND (ts BitmapAnd (cost=515553.98..515553.98 rows=62819 width=0)
-> Bitmap Index Scan on idx_player_id (cost=0.00..12749.35 rows=962837 width=0)
Index Cond: ((player_id)::text = 'xxx'::text)
-> Bitmap Index Scan on idx_ts (cost=0.00..502778.48 rows=37691480 width=0)
Index Cond: ((ts >= 1572566400) AND (ts < 1574899199))Better approach with partitions
As an improvement I started storing data in partitions, one per day and then creating the player id and timestamp index on every partition instead.
Solution
(normally around ~10 concurrent requests are executed)
... 16GB of ram. work_mem is set to 2GB,
Your work_mem setting seems nuts to me. With 10 concurrent queries and each one might use multiple instances of work_mem (especially common with partitioning--so to answer your title question, yes there is), it would be no surprise to run out of memory.
I've checked the resources on the DB machine (since it's supposed to run out of memory) while querying:
How did you do this? The data from different tools need to be interpreted in different ways.
Also, I've tried the following changes to the DB:
increase work_mem from 20MB to 2GB, no improvement
Right, increasing the amount of memory you use is unlikely to fix an out of memory problem. "The beatings will continue until morale improves."
But are you saying you had the exact same problem back when work_mem was set to only 20MB? Did you look in the database log file to see what it said about the problem directly (as opposed to what python passed along to you about it?)
From your description, it sounds like you might not need partitioning at all. Just a multicolumn index on
Do you ever plan to delete/archive old data, or will in accumulate indefinitely?
... 16GB of ram. work_mem is set to 2GB,
Your work_mem setting seems nuts to me. With 10 concurrent queries and each one might use multiple instances of work_mem (especially common with partitioning--so to answer your title question, yes there is), it would be no surprise to run out of memory.
I've checked the resources on the DB machine (since it's supposed to run out of memory) while querying:
How did you do this? The data from different tools need to be interpreted in different ways.
Also, I've tried the following changes to the DB:
increase work_mem from 20MB to 2GB, no improvement
Right, increasing the amount of memory you use is unlikely to fix an out of memory problem. "The beatings will continue until morale improves."
But are you saying you had the exact same problem back when work_mem was set to only 20MB? Did you look in the database log file to see what it said about the problem directly (as opposed to what python passed along to you about it?)
From your description, it sounds like you might not need partitioning at all. Just a multicolumn index on
(player_id, ts) would have likely fixed your problem without taking on the burden and overhead of partitioning.Do you ever plan to delete/archive old data, or will in accumulate indefinitely?
Context
StackExchange Database Administrators Q#262797, answer score: 3
Revisions (0)
No revisions yet.