snippetsqlMinor
How to speed up querying last values in a time series?
Viewed 0 times
lasttimequeryinghowvaluesseriesspeed
Problem
I have a time series table
Here is a simplified test case to illustrate the problem:
I want to quickly query the last values of each
My current solution is:
But this will give me very slow queries (~500ms) in this table with only ~30k rows.
The actual table has four columns that I want to group, instead of two. Here is what the actual table and query really looks like:
There are 34441 rows, currently. Some useful debug queries:
```
# explain (analyze,buffers) select * from last_prices;
QUERY PLAN
--------------------------------------------------------------------------
prices in a PostgreSQL 10 DB.Here is a simplified test case to illustrate the problem:
CREATE TABLE prices (
currency text NOT NULL,
side boolean NOT NULL,
price numeric NOT NULL,
ts timestamptz NOT NULL
);I want to quickly query the last values of each
currency/side duo, as this would give me the current buy/sell price of each currency.My current solution is:
create index on prices (currency, side, ts desc);
select distinct on (currency, side) *
order by currency, side, ts desc;But this will give me very slow queries (~500ms) in this table with only ~30k rows.
The actual table has four columns that I want to group, instead of two. Here is what the actual table and query really looks like:
create table prices (
exchange integer not null,
pair text not null,
side boolean not null,
guaranteed_volume numeric not null,
ts timestamp with time zone not null,
price numeric not null,
constraint prices_pkey primary key (exchange, pair, side, guaranteed_volume, ts),
constraint prices_exchange_fkey foreign key (exchange)
references exchanges (id) match simple
on update no action
on delete no action
);
create index prices_exchange_pair_side_guaranteed_volume_ts_idx
on prices (exchange, pair, side, guaranteed_volume, ts desc);
create view last_prices as
select distinct on (exchange, pair, side, guaranteed_volume)
exchange
, pair
, side
, guaranteed_volume
, price
, ts
from prices
order by exchange
, pair
, side
, guaranteed_volume
, ts desc;There are 34441 rows, currently. Some useful debug queries:
```
# explain (analyze,buffers) select * from last_prices;
QUERY PLAN
--------------------------------------------------------------------------
Solution
I want to quickly query the last values of each
If all you need is the latest timestamp
(Well, ideally, you'd have an index on
Typically, you need additional columns from each selected row (like the current price!) and/or you need to sort by multiple columns, so you need to do more.
Ideally, you have another table listing all currencies - and a FK constraint to enforce referential integrity and disallow nonexistent currencies. Then use the query technique from chapter "2a. LATERAL join" in the linked answer, expanded to account for the added
Based on your initial simple test case:
You should see very fast index scans on an index on
If index-only scans are possible and you only need
dbfiddle here
Whether you save this query in a
currency/side duoDISTINCT ON excels for few rows per combination of interest. But your use case obviously has many rows per distinct (currency, side). So DISTINCT ON is a bad choice as far as performance is concerned. You'll find a detailed assessment and an arsenal of solutions in these two related answers on SO:- Select first row in each GROUP BY group?
- Optimize GROUP BY query to retrieve latest record per user
If all you need is the latest timestamp
ts, the column is sort criteria and desired return value in one and the case is very simple. Look to Evan's simple solution with max(ts).(Well, ideally, you'd have an index on
(currency, side, ts desc NULLS LAST), since max(ts) ignores NULL values and better matches this sort order. But that won't matter much with a column defined NOT NULL.)Typically, you need additional columns from each selected row (like the current price!) and/or you need to sort by multiple columns, so you need to do more.
Ideally, you have another table listing all currencies - and a FK constraint to enforce referential integrity and disallow nonexistent currencies. Then use the query technique from chapter "2a. LATERAL join" in the linked answer, expanded to account for the added
side:Based on your initial simple test case:
SELECT c.currency, s.side, p.*
FROM currency c
CROSS JOIN (VALUES (true), (false)) s(side) -- account for side
CROSS JOIN LATERAL (
SELECT ts, price -- more columns?
FROM prices
WHERE currency = c.currency
AND side = s.side
ORDER BY ts DESC -- ts is NOT NULL
LIMIT 1
) p
ORDER BY 1, 2; -- optional, whatever you prefer;You should see very fast index scans on an index on
(currency, side, ts DESC).If index-only scans are possible and you only need
ts and price it might pay to add price as last column to the index.dbfiddle here
Whether you save this query in a
VIEW or not doesn't affect performance.Code Snippets
SELECT c.currency, s.side, p.*
FROM currency c
CROSS JOIN (VALUES (true), (false)) s(side) -- account for side
CROSS JOIN LATERAL (
SELECT ts, price -- more columns?
FROM prices
WHERE currency = c.currency
AND side = s.side
ORDER BY ts DESC -- ts is NOT NULL
LIMIT 1
) p
ORDER BY 1, 2; -- optional, whatever you prefer;Context
StackExchange Database Administrators Q#202248, answer score: 5
Revisions (0)
No revisions yet.