patternsqlMinor
Sorting multiple columns in a full outer join, possibly containing NULLs
Viewed 0 times
sortingcontainingfullcolumnspossiblyjoinmultiplenullsouter
Problem
I am trying to compare a plan against a history to see if there are steps omitted or if unplanned steps have been included. The regular, planned history items as well as the anomalies are intended for an elaborate report. The problem is, I have a column in each joined table that I need for sorting, but due to the
... and the history to the actual steps while cooking:
Now, I want to compare these tables in a result table with a
```
Table: Result
Sorting | TimeStamp | Position | Message | Comment
--------+-----------+----------+------------------------+---------------
1 | 00:01:00 | NULL | Chef entered kitchen | unplanned
2 | NULL | 1 | Wash vegetables | omitt
full outer join I cannot guarantee them to be not null, which breaks sorting. I'll continue with a simplified example, where the plan corresponds to a recipe:Table: RecipeSteps
Id | RecipeId | Position | Message
----+----------+----------+----------------------------
50 | 123 | 1 | Wash vegetables
51 | 123 | 2 | Peel vegetables
52 | 123 | 3 | Cut meat
53 | 123 | 4 | Turn on stove
54 | 123 | 5 | Cook and stir it
55 | 123 | 6 | Turn off stove
----+----------+----------+----------------------------... and the history to the actual steps while cooking:
Table: History
Id | TimeStamp | Session | StepId | Message
----+-----------+---------+--------+-----------------------------
90 | 00:01:00 | 321 | NULL | Chef entered kitchen
91 | 00:02:00 | 321 | 51 | Chef peeled vegetables
92 | 00:03:00 | 321 | 52 | Chef cut meat
93 | 00:04:00 | 321 | NULL | Chef picked his nose
94 | 00:05:00 | 321 | 53 | Chef turned on stove
95 | 00:06:00 | 321 | 54 | Chef started cooking
96 | 00:10:00 | 321 | NULL | Chef left kitchen
97 | 01:00:00 | 321 | NULL | FIRE!
----+-----------+---------+--------+-----------------------------Now, I want to compare these tables in a result table with a
full outer join similar to this:```
Table: Result
Sorting | TimeStamp | Position | Message | Comment
--------+-----------+----------+------------------------+---------------
1 | 00:01:00 | NULL | Chef entered kitchen | unplanned
2 | NULL | 1 | Wash vegetables | omitt
Solution
Please note: The query below doesn't take into account partitioning by
The main idea is to split the data set into two groups and process them separately.
So, the first group is all rows from
The second group is omitted Steps. Another
If you run the first
One way to do it is to take the last non-null value from the previous rows. Unfortunately, SQL Server doesn't implement
The very first
Once we have values for all Positions, it is trivial to union this result with all omitted steps and order them.
In this approach all omitted steps go after unplanned steps in case of ambiguity, so
Sample data
Query
Result
```
+-------------+---------------------+----------+------------------------+-----------+
| NewPosition | ts | Position | NewMessage | Comment |
+-------------+---------------------+----------+------------------------+-----------+
| 0 | 2017-01-01 00:01:00 | NULL | Chef entered kitchen | unplanned |
| 1 | NULL | 1 | Wash vegetables | omitted |
|
Session to keep it simple.The main idea is to split the data set into two groups and process them separately.
History goes first. I decided to make History more important than Steps. If some steps are swapped, they will be presented in the history order.So, the first group is all rows from
History plus possible details from the corresponding Step. This is a simple LEFT JOIN.The second group is omitted Steps. Another
LEFT JOIN with a filter.If you run the first
CTE_History ordering by timestamp you'll see that unplanned events have NULL Position, but they will be in correct position in the result set, because of their timestamp. All we need to do is fill in these blanks. +-------------+---------------------+----------+------------------------+-----------+
| NewPosition | ts | Position | NewMessage | Comment |
+-------------+---------------------+----------+------------------------+-----------+
| 0 | 2017-01-01 00:01:00 | NULL | Chef entered kitchen | unplanned |
| 2 | 2017-01-01 00:02:00 | 2 | Chef peeled vegetables | planned |
| 3 | 2017-01-01 00:03:00 | 3 | Chef cut meat | planned |
| 3 | 2017-01-01 00:04:00 | NULL | Chef picked his nose | unplanned |
| 4 | 2017-01-01 00:05:00 | 4 | Chef turned on stove | planned |
| 5 | 2017-01-01 00:06:00 | 5 | Chef started cooking | planned |
| 5 | 2017-01-01 00:10:00 | NULL | Chef left kitchen | unplanned |
| 5 | 2017-01-01 01:00:00 | NULL | FIRE! | unplanned |
+-------------+---------------------+----------+------------------------+-----------+One way to do it is to take the last non-null value from the previous rows. Unfortunately, SQL Server doesn't implement
IGNORE NULLS clause in the LAST_VALUE function, so we have to use a workaround with MAX and rely on the fact that Position grows as ts grows. In a more general case it will be a bit more complicated, as Itzik Ben-Gan showed in his The Last non NULL Puzzle.The very first
NULL is changed to 0 (or some number that is less than any possible Position).Once we have values for all Positions, it is trivial to union this result with all omitted steps and order them.
In this approach all omitted steps go after unplanned steps in case of ambiguity, so
Turn off stove will appear last. If you choose to calculate MIN of following rows instead of MAX of previous, this rule will be swapped.Sample data
DECLARE @RecipeSteps TABLE (Id int, RecipeId int, Position int, Message nvarchar(255));
INSERT INTO @RecipeSteps (Id, RecipeId, Position, Message) VALUES
(50, 123, 1, 'Wash vegetables '),
(51, 123, 2, 'Peel vegetables '), --
(52, 123, 3, 'Cut meat '), --
(53, 123, 4, 'Turn on stove '), --
(54, 123, 5, 'Cook and stir it'), --
(55, 123, 6, 'Turn off stove ');
DECLARE @History TABLE (Id int, ts datetime2(0), Session int, StepId int, Message nvarchar(255));
INSERT INTO @History (Id, ts, Session, StepId, Message) VALUES
(90, '2017-01-01 00:01:00', 321, NULL, 'Chef entered kitchen '),
(91, '2017-01-01 00:02:00', 321, 51, 'Chef peeled vegetables'),
(92, '2017-01-01 00:03:00', 321, 52, 'Chef cut meat '),
(93, '2017-01-01 00:04:00', 321, NULL, 'Chef picked his nose '),
(94, '2017-01-01 00:05:00', 321, 53, 'Chef turned on stove '),
(95, '2017-01-01 00:06:00', 321, 54, 'Chef started cooking '),
(96, '2017-01-01 00:10:00', 321, NULL, 'Chef left kitchen '),
(97, '2017-01-01 01:00:00', 321, NULL, 'FIRE! ');Query
WITH
CTE_History
AS
(
SELECT
ISNULL(
MAX(Position)
OVER (ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
, 0) AS NewPosition
,H.ts
,S.Position
,H.Message AS NewMessage
,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
FROM
@History AS H
LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_OmittedSteps
AS
(
SELECT
Position AS NewPosition
,H.ts
,S.Position
,S.Message AS NewMessage
,'omitted' AS Comment
FROM
@RecipeSteps AS S
LEFT JOIN @History AS H ON S.Id = H.StepId
WHERE
H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;Result
```
+-------------+---------------------+----------+------------------------+-----------+
| NewPosition | ts | Position | NewMessage | Comment |
+-------------+---------------------+----------+------------------------+-----------+
| 0 | 2017-01-01 00:01:00 | NULL | Chef entered kitchen | unplanned |
| 1 | NULL | 1 | Wash vegetables | omitted |
|
Code Snippets
+-------------+---------------------+----------+------------------------+-----------+
| NewPosition | ts | Position | NewMessage | Comment |
+-------------+---------------------+----------+------------------------+-----------+
| 0 | 2017-01-01 00:01:00 | NULL | Chef entered kitchen | unplanned |
| 2 | 2017-01-01 00:02:00 | 2 | Chef peeled vegetables | planned |
| 3 | 2017-01-01 00:03:00 | 3 | Chef cut meat | planned |
| 3 | 2017-01-01 00:04:00 | NULL | Chef picked his nose | unplanned |
| 4 | 2017-01-01 00:05:00 | 4 | Chef turned on stove | planned |
| 5 | 2017-01-01 00:06:00 | 5 | Chef started cooking | planned |
| 5 | 2017-01-01 00:10:00 | NULL | Chef left kitchen | unplanned |
| 5 | 2017-01-01 01:00:00 | NULL | FIRE! | unplanned |
+-------------+---------------------+----------+------------------------+-----------+DECLARE @RecipeSteps TABLE (Id int, RecipeId int, Position int, Message nvarchar(255));
INSERT INTO @RecipeSteps (Id, RecipeId, Position, Message) VALUES
(50, 123, 1, 'Wash vegetables '),
(51, 123, 2, 'Peel vegetables '), --
(52, 123, 3, 'Cut meat '), --
(53, 123, 4, 'Turn on stove '), --
(54, 123, 5, 'Cook and stir it'), --
(55, 123, 6, 'Turn off stove ');
DECLARE @History TABLE (Id int, ts datetime2(0), Session int, StepId int, Message nvarchar(255));
INSERT INTO @History (Id, ts, Session, StepId, Message) VALUES
(90, '2017-01-01 00:01:00', 321, NULL, 'Chef entered kitchen '),
(91, '2017-01-01 00:02:00', 321, 51, 'Chef peeled vegetables'),
(92, '2017-01-01 00:03:00', 321, 52, 'Chef cut meat '),
(93, '2017-01-01 00:04:00', 321, NULL, 'Chef picked his nose '),
(94, '2017-01-01 00:05:00', 321, 53, 'Chef turned on stove '),
(95, '2017-01-01 00:06:00', 321, 54, 'Chef started cooking '),
(96, '2017-01-01 00:10:00', 321, NULL, 'Chef left kitchen '),
(97, '2017-01-01 01:00:00', 321, NULL, 'FIRE! ');WITH
CTE_History
AS
(
SELECT
ISNULL(
MAX(Position)
OVER (ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
, 0) AS NewPosition
,H.ts
,S.Position
,H.Message AS NewMessage
,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
FROM
@History AS H
LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_OmittedSteps
AS
(
SELECT
Position AS NewPosition
,H.ts
,S.Position
,S.Message AS NewMessage
,'omitted' AS Comment
FROM
@RecipeSteps AS S
LEFT JOIN @History AS H ON S.Id = H.StepId
WHERE
H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;+-------------+---------------------+----------+------------------------+-----------+
| NewPosition | ts | Position | NewMessage | Comment |
+-------------+---------------------+----------+------------------------+-----------+
| 0 | 2017-01-01 00:01:00 | NULL | Chef entered kitchen | unplanned |
| 1 | NULL | 1 | Wash vegetables | omitted |
| 2 | 2017-01-01 00:02:00 | 2 | Chef peeled vegetables | planned |
| 3 | 2017-01-01 00:03:00 | 3 | Chef cut meat | planned |
| 3 | 2017-01-01 00:04:00 | NULL | Chef picked his nose | unplanned |
| 4 | 2017-01-01 00:05:00 | 4 | Chef turned on stove | planned |
| 5 | 2017-01-01 00:06:00 | 5 | Chef started cooking | planned |
| 5 | 2017-01-01 00:10:00 | NULL | Chef left kitchen | unplanned |
| 5 | 2017-01-01 01:00:00 | NULL | FIRE! | unplanned |
| 6 | NULL | 6 | Turn off stove | omitted |
+-------------+---------------------+----------+------------------------+-----------+WITH
CTE_HistoryRaw
AS
(
SELECT
H.ts
,S.Position
,H.Message AS NewMessage
,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
FROM
@History AS H
LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_History
AS
(
SELECT
ISNULL(A.NewPosition, 0) AS NewPosition
,ts
,Position
,NewMessage
,Comment
FROM
CTE_HistoryRaw AS Curr
OUTER APPLY
(
SELECT TOP(1)
Prev.Position AS NewPosition
FROM CTE_HistoryRaw AS Prev
WHERE
Prev.ts <= Curr.ts
AND Prev.Position IS NOT NULL
ORDER BY Prev.ts DESC
) AS A
)
,CTE_OmittedSteps
AS
(
SELECT
Position AS NewPosition
,H.ts
,S.Position
,S.Message AS NewMessage
,'omitted' AS Comment
FROM
@RecipeSteps AS S
LEFT JOIN @History AS H ON S.Id = H.StepId
WHERE
H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;Context
StackExchange Database Administrators Q#188858, answer score: 3
Revisions (0)
No revisions yet.