patternsqlMinor
Understanding NOT EXISTS
Viewed 0 times
understandingnotexists
Problem
Consider the following table,
I am struggling to understand application of the
The following query is meant to find the starting points of 'islands':
Now the query within the NOT EXISTS condition,
returns, the column (let's call this column A) SeqNo as:
The 'full' column (let's call this column B) for SeqNo is:
Why then, when we take column B values WHERE NOT EXISTS in column A, do we find the column
The value 1 in column B is also IN column A, so then why is the value 1 appearing in the column SeqNo of the query as a whole?
CREATE TABLE GapsIslands (ID INT NOT NULL, SeqNo INT NOT NULL);
ALTER TABLE GapsIslands ADD CONSTRAINT pk_GapsIslands PRIMARY KEY (ID, SeqNo);
INSERT INTO GapsIslands
SELECT 1, 1 UNION ALL SELECT 1, 2 UNION ALL SELECT 1, 5 UNION ALL SELECT 1, 6
UNION ALL SELECT 1, 8 UNION ALL SELECT 1, 9 UNION ALL SELECT 1, 10 UNION ALL SELECT 1, 12
UNION ALL SELECT 1, 20 UNION ALL SELECT 1, 21 UNION ALL SELECT 1, 25 UNION ALL SELECT 1, 26;I am struggling to understand application of the
NOT EXISTS condition in the context of a Gaps and Islands solution.The following query is meant to find the starting points of 'islands':
SELECT ID, SeqNo, ROW_NUMBER() OVER (ORDER BY SeqNo) AS RowNum
FROM GapsIslands AS a
WHERE NOT EXISTS (
SELECT *
FROM GapsIslands AS b
WHERE b.ID = a.ID AND b.SeqNo = a.SeqNo - 1)Now the query within the NOT EXISTS condition,
SELECT *
FROM GapsIslands AS b
WHERE b.ID = a.ID AND b.SeqNo = a.SeqNo - 1returns, the column (let's call this column A) SeqNo as:
SeqNo
1
5
8
9
20
25The 'full' column (let's call this column B) for SeqNo is:
SeqNo
1
2
5
6
8
9
10
12
20
21
25
26Why then, when we take column B values WHERE NOT EXISTS in column A, do we find the column
SeqNo
1
5
8
12
20
25The value 1 in column B is also IN column A, so then why is the value 1 appearing in the column SeqNo of the query as a whole?
Solution
Let's move the problematic expression from
fiddle
The query gives us the next output:
ID | SeqNo | RowNum | previous_SeqNo_exists
-: | ----: | :----- | :--------------------
1 | 1 | 1 | Not exists
1 | 2 | 2 | Exists
1 | 5 | 3 | Not exists
1 | 6 | 4 | Exists
1 | 8 | 5 | Not exists
1 | 9 | 6 | Exists
1 | 10 | 7 | Exists
1 | 12 | 8 | Not exists
1 | 20 | 9 | Not exists
1 | 21 | 10 | Exists
1 | 25 | 11 | Not exists
1 | 26 | 12 | Exists
It seems it is absolutely clear. SeqNo=1 have no previous value, because there is no such record at all, SeqNo=5 (and all another) - because there is a gap before.
When we use this
PS. The query under consideration (posted by OP) does not contain ORDER BY clause. That is bad. This may cause the records will be returned in random order. Nevertheless record numbers calculated by
WHERE condition to SELECT output list. EXISTS gives boolean, and SQL server don't want to display it directly, so we'll use CASE and convert it to readable form.SELECT ID, SeqNo, ROW_NUMBER() OVER (ORDER BY SeqNo) AS RowNum
/ There we insert our EXISTS moved from WHERE /
/ and wrapped into CASE /
, CASE WHEN EXISTS ( SELECT *
FROM GapsIslands AS b
WHERE b.ID = a.ID
AND b.SeqNo = a.SeqNo - 1 )
THEN 'Exists' / EXISTS is true there, NOT EXISIS is false /
ELSE 'Not exists' / NOT EXISTS is true there, EXISIS is false /
END AS previous_SeqNo_exists
/ ----------------------------------------------- /
FROM GapsIslands AS a
ORDER BY SeqNo
fiddle
The query gives us the next output:
ID | SeqNo | RowNum | previous_SeqNo_exists
-: | ----: | :----- | :--------------------
1 | 1 | 1 | Not exists
1 | 2 | 2 | Exists
1 | 5 | 3 | Not exists
1 | 6 | 4 | Exists
1 | 8 | 5 | Not exists
1 | 9 | 6 | Exists
1 | 10 | 7 | Exists
1 | 12 | 8 | Not exists
1 | 20 | 9 | Not exists
1 | 21 | 10 | Exists
1 | 25 | 11 | Not exists
1 | 26 | 12 | Exists
It seems it is absolutely clear. SeqNo=1 have no previous value, because there is no such record at all, SeqNo=5 (and all another) - because there is a gap before.
When we use this
EXISTS expression in WHERE the records with "Exists" will be removed, and we will obtain only records with "Not exists" (do not forget that there is additional NOT operator, so only records which gives FALSE will be returned).ROW_NUMBER(), which works after WHERE, simply enumerates returned records in given ordering.PS. The query under consideration (posted by OP) does not contain ORDER BY clause. That is bad. This may cause the records will be returned in random order. Nevertheless record numbers calculated by
ROW_NUMBER() will be correct because the function have its own, local, ordering.Context
StackExchange Database Administrators Q#256232, answer score: 5
Revisions (0)
No revisions yet.