patternsqlMinor
Efficiently store time based data
Viewed 0 times
efficientlytimestorebaseddata
Problem
As a reporting solution, I have a large table with ~50M records called "reports". Currently, I am using PostgreSQL with partitioning table feature day by day, (table name
An example query (runs 41 seconds)
https://gist.github.com/onesvat/be234fbcb6c4d375f9d1dd4151d69391
If the interval is small, there is no problem but when I use larger intervals, it becomes too slow (20 or more seconds).
The system has 128GB Ram, 16 threads and 4 SSD disk with raid0.
The system never deletes or updates yesterday and backward tables so there is no need the re-query older datas. Is there any database or extension that can handle these type of queries in reasonable time?
There is indexes on most columns that includes date and something, in fact, there is additional indexes like:
also constraint :
reports_20170101 means that all the records of 2017-01-01 stores in there). An example query (runs 41 seconds)
SELECT to_char(date_trunc('week', rpt_datetime), 'YYYY-WW') date_week,
SUM(rpt_revenue) revenue FROM reports
WHERE rpt_datetime < ?
GROUP BY date_weekEXPLAIN, ANALYZE result:https://gist.github.com/onesvat/be234fbcb6c4d375f9d1dd4151d69391
If the interval is small, there is no problem but when I use larger intervals, it becomes too slow (20 or more seconds).
The system has 128GB Ram, 16 threads and 4 SSD disk with raid0.
The system never deletes or updates yesterday and backward tables so there is no need the re-query older datas. Is there any database or extension that can handle these type of queries in reasonable time?
There is indexes on most columns that includes date and something, in fact, there is additional indexes like:
btree (date_trunc('week'::text, timezone('Europe/Istanbul'::text, rpt_datetime)))also constraint :
"reports_20170101_rpt_datetime_check" CHECK (rpt_datetime >= '2017-01-01 00:00:00+00'::timestamp with time zone AND rpt_datetime < '2017-01-02 00:00:00+00'::timestamp with time zone)Solution
You might look into using materialized views, if the partitions become effectively read-only on a daily basis. You can pre-calculate the results of these queries for the older data in the materialized views.
Context
StackExchange Database Administrators Q#168872, answer score: 4
Revisions (0)
No revisions yet.