patternsqlMinor
Merging two tables by minimal interval values
Viewed 0 times
tablesvaluesmergingintervaltwominimal
Problem
I have two tables with two columns keeping tabs of categorical values, such as:
These Begin and End columns represent a continuous interval in which the value is valid. In the above example, the interval would be days in a month, so according to Table 1 I'd have a normal condition from day 1 to day 8, critical from day 8 to day 23, and normal again from the 23rd to the 30th. And I know from Table 2 that John was supervising from the 1st to the 14th, and Janice took over from the 14th to the 30th.
What I want is to merge these two tables, to have both Condition and Supervisor values in the same table, with the minimal interval for each pairing of values. So, this:
What can be guaranteed of the tables is that:
I had this worked out programatically in
Table 1
+----+-------+-----+-----------+
| ID | Begin | End | Condition |
+----+-------+-----+-----------+
| 1 | 1 | 8 | Normal |
| 2 | 8 | 23 | Critical |
| 3 | 23 | 30 | Normal |
+----+-------+-----+-----------+
Table 2
+----+-------+-----+------------+
| ID | Begin | End | Supervisor |
+----+-------+-----+------------+
| 1 | 1 | 14 | John |
| 2 | 14 | 30 | Janice |
+----+-------+-----+------------+These Begin and End columns represent a continuous interval in which the value is valid. In the above example, the interval would be days in a month, so according to Table 1 I'd have a normal condition from day 1 to day 8, critical from day 8 to day 23, and normal again from the 23rd to the 30th. And I know from Table 2 that John was supervising from the 1st to the 14th, and Janice took over from the 14th to the 30th.
What I want is to merge these two tables, to have both Condition and Supervisor values in the same table, with the minimal interval for each pairing of values. So, this:
Merged Table
+----+-------+-----+-----------+------------+
| ID | Begin | End | Condition | Supervisor |
+----+-------+-----+-----------+------------+
| 1 | 1 | 8 | Normal | John |
| 2 | 8 | 14 | Critical | John |
| 3 | 14 | 23 | Critical | Janice |
| 4 | 23 | 30 | Normal | Janice |
+----+-------+-----+-----------+------------+What can be guaranteed of the tables is that:
- each will have its respective "Begin" and "End" fields.
- those will have the same span (the "Begin" value of the first row and the "End" value of the last row are the same for every table).
- that every interval is sequential (the "End" value of row n is always equal to the "Begin" value of row n+1).
- that the value of "End" will always be higher than "Begin" for a given row.
I had this worked out programatically in
Solution
I'd approach this in three logical stages:
So the solution looks like this:
begin_on | end_on | condition | supervisor
-------: | -----: | :-------- | :---------
1 | 8 | Normal | John
8 | 14 | Critical | John
14 | 23 | Critical | Janice
23 | 30 | Normal | Janice
dbfiddle here
I am assuming this is a cut-down example of your real-world problem or I would also suggest changing the way you store the data in the first place, perhaps using date ranges instead of integers.
- Expand the rows in both tables for all the days in the month and join them together.
- Assign a 'group' to each series of rows where
conditionandsupervisordon't change for a period (a "gaps and islands" problem).
- Group the results.
So the solution looks like this:
create table t1(
id serial primary key
, begin_on integer
, end_on integer
, condition text
);
insert into t1(begin_on,end_on,condition)
values (1,8,'Normal')
, (8,23,'Critical')
, (23,30,'Normal');
create table t2(
id serial primary key
, begin_on integer
, end_on integer
, supervisor text
);
insert into t2(begin_on,end_on,supervisor)
values (1,14,'John')
, (14,30,'Janice');select min(g) begin_on, max(g)+1 end_on, condition, supervisor
from( select g
, condition
, supervisor
, row_number() over (order by g)
- row_number() over (partition by t1.id, t2.id order by g) grp
from generate_series(1,30) g
join t1 on g>=t1.begin_on and g=t2.begin_on and g<t2.end_on ) z
group by condition, supervisor, grp
order by begin_on;begin_on | end_on | condition | supervisor
-------: | -----: | :-------- | :---------
1 | 8 | Normal | John
8 | 14 | Critical | John
14 | 23 | Critical | Janice
23 | 30 | Normal | Janice
dbfiddle here
I am assuming this is a cut-down example of your real-world problem or I would also suggest changing the way you store the data in the first place, perhaps using date ranges instead of integers.
Code Snippets
create table t1(
id serial primary key
, begin_on integer
, end_on integer
, condition text
);
insert into t1(begin_on,end_on,condition)
values (1,8,'Normal')
, (8,23,'Critical')
, (23,30,'Normal');
create table t2(
id serial primary key
, begin_on integer
, end_on integer
, supervisor text
);
insert into t2(begin_on,end_on,supervisor)
values (1,14,'John')
, (14,30,'Janice');select min(g) begin_on, max(g)+1 end_on, condition, supervisor
from( select g
, condition
, supervisor
, row_number() over (order by g)
- row_number() over (partition by t1.id, t2.id order by g) grp
from generate_series(1,30) g
join t1 on g>=t1.begin_on and g<t1.end_on
join t2 on g>=t2.begin_on and g<t2.end_on ) z
group by condition, supervisor, grp
order by begin_on;Context
StackExchange Database Administrators Q#199846, answer score: 3
Revisions (0)
No revisions yet.