patternsqlMinor
Return values at more/less specific time points
Viewed 0 times
returnpointsmoretimelessspecificvalues
Problem
In PostgreSQL 10.0, I have a big table with two columns:
Now, if I have 100 given time points, how do I return exactly 100
For just a single one this would be very easy:
But how to do this efficiently for a 100? Something like:
Obviously, there is an index on
CREATE TABLE xs (
ts timestamp with time zone NOT NULL,
x integer NOT NULL
)Now, if I have 100 given time points, how do I return exactly 100
x-es that are right before (<=) these time points (for each of those points)?For just a single one this would be very easy:
SELECT x FROM xs WHERE ts <= '2018-03-13 11:41:47.167973+00' LIMIT 1But how to do this efficiently for a 100? Something like:
SELECT x FROM xs
WHERE ts FIRST_LESS_THAN_EQUAL_IN
('2018-03-13 11:41:47.167973+00',
'2018-03-13 11:41:47.198564+00',
'2018-03-13 11:41:47.555668+00',
...
'2018-03-13 11:41:57.766888+00')Obviously, there is an index on
ts.Solution
I suggest a
A
There are others ways to pass your values, like an array (or an array literal cast to
About the
Don't forget the
If
An index on
Or, ideally, on
dbfiddle here
LATERAL subquery, best with a LEFT [OUTER] JOIN to preserve all input rows (displaying a NULL value in case there should be no match at all):SELECT v.reference_ts, t.most_recent_x
FROM (
VALUES
(timestamptz '2018-03-13 11:41:47.167973+00') -- type cast in first row
, ('2018-03-13 11:41:47.198564+00')
, ('2018-03-13 11:41:47.555668+00')
, ('2018-03-13 11:41:57.766888+00')
) AS v(reference_ts)
LEFT JOIN LATERAL (
SELECT x AS most_recent_x
FROM xs
WHERE ts <= v.reference_ts -- ts is unique (?!); see below
ORDER BY ts DESC
LIMIT 1
) AS t ON true;A
VALUES expression is one way to put your 100 timestamps into a derived table we can work with. The free-standing VALUES expression may require an explicit type cast in the first row. See:- Casting NULL type when updating multiple rows
There are others ways to pass your values, like an array (or an array literal cast to
timestamptz[]) and unnest() in the query:- Optimizing a Postgres query with a large IN
About the
LATERAL subquery:- What is the difference between LATERAL and a subquery in PostgreSQL?
Don't forget the
ORDER BY or you get arbitrary results - like @a1ex07 commented.If
ts is not unique, it's undefined which of the duplicates is returned, unless you add more ORDER BY items to make it deterministic. Like: ORDER BY ts DESC, x DESC to get the row with the biggest x from any set of peers. And make the index match.An index on
(ts) (like you mentioned) works wonders on performance.Or, ideally, on
(ts DESC, x) if preconditions for index-only scans are met.- Does PostgreSQL use an index-only scan in this JOIN?
dbfiddle here
Code Snippets
SELECT v.reference_ts, t.most_recent_x
FROM (
VALUES
(timestamptz '2018-03-13 11:41:47.167973+00') -- type cast in first row
, ('2018-03-13 11:41:47.198564+00')
, ('2018-03-13 11:41:47.555668+00')
, ('2018-03-13 11:41:57.766888+00')
) AS v(reference_ts)
LEFT JOIN LATERAL (
SELECT x AS most_recent_x
FROM xs
WHERE ts <= v.reference_ts -- ts is unique (?!); see below
ORDER BY ts DESC
LIMIT 1
) AS t ON true;Context
StackExchange Database Administrators Q#200105, answer score: 2
Revisions (0)
No revisions yet.