HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to speed up querying last values in a time series?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
lasttimequeryinghowvaluesseriesspeed

Problem

I have a time series table prices in a PostgreSQL 10 DB.

Here is a simplified test case to illustrate the problem:

CREATE TABLE prices (
    currency text NOT NULL,
    side     boolean NOT NULL,
    price    numeric NOT NULL,
    ts       timestamptz NOT NULL
);


I want to quickly query the last values of each currency/side duo, as this would give me the current buy/sell price of each currency.

My current solution is:

create index on prices (currency, side, ts desc);

select distinct on (currency, side) *
 order by currency, side, ts desc;


But this will give me very slow queries (~500ms) in this table with only ~30k rows.

The actual table has four columns that I want to group, instead of two. Here is what the actual table and query really looks like:

create table prices (
    exchange integer not null,
    pair text not null,
    side boolean not null,
    guaranteed_volume numeric not null,
    ts timestamp with time zone not null,
    price numeric not null,
    constraint prices_pkey primary key (exchange, pair, side, guaranteed_volume, ts),
    constraint prices_exchange_fkey foreign key (exchange)
        references exchanges (id) match simple
        on update no action
        on delete no action
);

create index prices_exchange_pair_side_guaranteed_volume_ts_idx
      on prices (exchange, pair, side, guaranteed_volume, ts desc);

create view last_prices as
select distinct on (exchange, pair, side, guaranteed_volume)
       exchange
     , pair
     , side
     , guaranteed_volume
     , price
     , ts
  from prices
 order by exchange
        , pair
        , side
        , guaranteed_volume
        , ts desc;


There are 34441 rows, currently. Some useful debug queries:

```
# explain (analyze,buffers) select * from last_prices;
QUERY PLAN
--------------------------------------------------------------------------

Solution

I want to quickly query the last values of each currency/side duo

DISTINCT ON excels for few rows per combination of interest. But your use case obviously has many rows per distinct (currency, side). So DISTINCT ON is a bad choice as far as performance is concerned. You'll find a detailed assessment and an arsenal of solutions in these two related answers on SO:

  • Select first row in each GROUP BY group?



  • Optimize GROUP BY query to retrieve latest record per user



If all you need is the latest timestamp ts, the column is sort criteria and desired return value in one and the case is very simple. Look to Evan's simple solution with max(ts).

(Well, ideally, you'd have an index on (currency, side, ts desc NULLS LAST), since max(ts) ignores NULL values and better matches this sort order. But that won't matter much with a column defined NOT NULL.)

Typically, you need additional columns from each selected row (like the current price!) and/or you need to sort by multiple columns, so you need to do more.

Ideally, you have another table listing all currencies - and a FK constraint to enforce referential integrity and disallow nonexistent currencies. Then use the query technique from chapter "2a. LATERAL join" in the linked answer, expanded to account for the added side:

Based on your initial simple test case:

SELECT c.currency, s.side, p.*
FROM   currency c
CROSS  JOIN (VALUES (true), (false)) s(side)  -- account for side
CROSS  JOIN LATERAL (
   SELECT ts, price              -- more columns?
   FROM   prices
   WHERE  currency = c.currency
   AND    side = s.side
   ORDER  BY ts DESC             -- ts is NOT NULL
   LIMIT  1
   ) p
ORDER  BY 1, 2;  -- optional, whatever you prefer;


You should see very fast index scans on an index on (currency, side, ts DESC).

If index-only scans are possible and you only need ts and price it might pay to add price as last column to the index.

dbfiddle here

Whether you save this query in a VIEW or not doesn't affect performance.

Code Snippets

SELECT c.currency, s.side, p.*
FROM   currency c
CROSS  JOIN (VALUES (true), (false)) s(side)  -- account for side
CROSS  JOIN LATERAL (
   SELECT ts, price              -- more columns?
   FROM   prices
   WHERE  currency = c.currency
   AND    side = s.side
   ORDER  BY ts DESC             -- ts is NOT NULL
   LIMIT  1
   ) p
ORDER  BY 1, 2;  -- optional, whatever you prefer;

Context

StackExchange Database Administrators Q#202248, answer score: 5

Revisions (0)

No revisions yet.