patternsqlMinor
Combine two event tables for multiple timelines into a single result set
Viewed 0 times
resulttablescombineintotwosingleformultipletimelinesset
Problem
This question is an extension to a question I've previously asked that was overly simplified. The more accurate example is demonstrated in this SQLFiddle, where I demonstrate a working (but slow) solution, followed by my attempt to adapt the previous answer to the actual problem.
The actual problem comes because the two tables contain events for multiple timelines.
Each table has events for timelines 'A' and 'B'. The goal is to combined the results in to a single result set showing the "state" of each timeline. The two timelines are orthogonal.
ts id foo bar
1 A Lorem adipiscing
5 A sit adipiscing
6 A sit sed
8 A amet sed
1 B ipsum (null)
4 B dolor (null)
5 B dolor elit
8 B consectetur elit
9 B consectetur do
The actual problem comes because the two tables contain events for multiple timelines.
CREATE TABLE foo (ts int, id text, foo text);
INSERT INTO foo (ts, id, foo)
VALUES
(1, 'A', 'Lorem'),
(1, 'B', 'ipsum'),
(4, 'B', 'dolor'),
(5, 'A', 'sit'),
(8, 'A', 'amet'),
(8, 'B', 'consectetur');
CREATE TABLE bar (ts int, id text, bar text);
INSERT INTO bar (ts, id, bar)
VALUES
(1, 'A', 'adipiscing'),
(5, 'B', 'elit'),
(6, 'A', 'sed'),
(9, 'B', 'do ');Each table has events for timelines 'A' and 'B'. The goal is to combined the results in to a single result set showing the "state" of each timeline. The two timelines are orthogonal.
ts id foo bar
1 A Lorem adipiscing
5 A sit adipiscing
6 A sit sed
8 A amet sed
1 B ipsum (null)
4 B dolor (null)
5 B dolor elit
8 B consectetur elit
9 B consectetur do
Solution
In addition to the solution of the simple case, add a
Result as requested (except with
SQL Fiddle
Your attempt to adapt the previous solution was very close. It didn't work because of
If performance is your paramount requirement consider a server-side function like demonstrated in the previous answer.
PARTITION clause to the window functions in the inner query, to get group numbers per partition (per "timeline"). Combine group numbers with the respective timeline (id in your example) keep partitions separate in the second step:SELECT id, ts
, min(foo) OVER (PARTITION BY id, foo_grp) AS foo
, min(bar) OVER (PARTITION BY id, bar_grp) AS bar
FROM (
SELECT id, ts, f.foo, b.bar
, count(f.foo) OVER (PARTITION BY id ORDER BY ts) AS foo_grp
, count(b.bar) OVER (PARTITION BY id ORDER BY ts) AS bar_grp
FROM foo f
FULL JOIN bar b USING (id, ts)
) sub
ORDER BY 1, 2;Result as requested (except with
id first).SQL Fiddle
Your attempt to adapt the previous solution was very close. It didn't work because of
PARTITION BY f.id / PARTITION BY b.id instead of PARTITION BY id. You really want the combined id to include missing rows in the result - that's where the last non-null value has to be filled in for the missing (NULL) value.If performance is your paramount requirement consider a server-side function like demonstrated in the previous answer.
Code Snippets
SELECT id, ts
, min(foo) OVER (PARTITION BY id, foo_grp) AS foo
, min(bar) OVER (PARTITION BY id, bar_grp) AS bar
FROM (
SELECT id, ts, f.foo, b.bar
, count(f.foo) OVER (PARTITION BY id ORDER BY ts) AS foo_grp
, count(b.bar) OVER (PARTITION BY id ORDER BY ts) AS bar_grp
FROM foo f
FULL JOIN bar b USING (id, ts)
) sub
ORDER BY 1, 2;Context
StackExchange Database Administrators Q#106514, answer score: 4
Revisions (0)
No revisions yet.