snippetsqlMajor
How to efficiently check EXISTS on multiple columns?
Viewed 0 times
efficientlyhowcolumnsmultipleexistscheck
Problem
This is an issue I come up against periodically and have not yet found a good solution for.
Supposing the following table structure
and the requirement is to determine whether either of the nullable columns
Also assume the table contains millions of rows (and that no column statistics are available that could be peeked at as I am interested in a more generic solution for this class of queries).
I can think of a few ways of approaching this but all have weaknesses.
Two separate
Single Aggregate Query
This could process both columns at the same time so have a worst case of one full scan.
The disadvantage is that even if it encounters a
User variables
I can think of a third way of doing this
but this is not suitable for production code as the correct behavior for an aggregate concatenation query is undefined. and terminating the
Supposing the following table structure
CREATE TABLE T
(
A INT PRIMARY KEY,
B CHAR(1000) NULL,
C CHAR(1000) NULL
)and the requirement is to determine whether either of the nullable columns
B or C actually contain any NULL values (and if so which one(s)). Also assume the table contains millions of rows (and that no column statistics are available that could be peeked at as I am interested in a more generic solution for this class of queries).
I can think of a few ways of approaching this but all have weaknesses.
Two separate
EXISTS statements. This would have the advantage of allowing the queries to stop scanning early as soon as a NULL is found. But if both columns in fact contain no NULLs then two full scans will result.Single Aggregate Query
SELECT
MAX(CASE WHEN B IS NULL THEN 1 ELSE 0 END) AS B,
MAX(CASE WHEN C IS NULL THEN 1 ELSE 0 END) AS C
FROM TThis could process both columns at the same time so have a worst case of one full scan.
The disadvantage is that even if it encounters a
NULL in both columns very early on the query will still end up scanning the whole of the rest of the table. User variables
I can think of a third way of doing this
BEGIN TRY
DECLARE @B INT, @C INT, @D INT
SELECT
@B = CASE WHEN B IS NULL THEN 1 ELSE @B END,
@C = CASE WHEN C IS NULL THEN 1 ELSE @C END,
/*Divide by zero error if both @B and @C are 1.
Might happen next row as no guarantee of order of
assignments*/
@D = 1 / (2 - (@B + @C))
FROM T
OPTION (MAXDOP 1)
END TRY
BEGIN CATCH
IF ERROR_NUMBER() = 8134 /*Divide by zero*/
BEGIN
SELECT 'B,C both contain NULLs'
RETURN;
END
ELSE
RETURN;
END CATCH
SELECT ISNULL(@B,0),
ISNULL(@C,0)but this is not suitable for production code as the correct behavior for an aggregate concatenation query is undefined. and terminating the
Solution
How about:
SELECT TOP 3 *
FROM (SELECT DISTINCT
CASE WHEN B IS NULL THEN NULL ELSE 'foo' END AS B
, CASE WHEN C IS NULL THEN NULL ELSE 'bar' END AS C
FROM T
WHERE
(B IS NULL AND C IS NOT NULL)
OR (B IS NOT NULL AND C IS NULL)
OR (B IS NULL AND C IS NULL)
) AS DTCode Snippets
SELECT TOP 3 *
FROM (SELECT DISTINCT
CASE WHEN B IS NULL THEN NULL ELSE 'foo' END AS B
, CASE WHEN C IS NULL THEN NULL ELSE 'bar' END AS C
FROM T
WHERE
(B IS NULL AND C IS NOT NULL)
OR (B IS NOT NULL AND C IS NULL)
OR (B IS NULL AND C IS NULL)
) AS DTContext
StackExchange Database Administrators Q#19344, answer score: 20
Revisions (0)
No revisions yet.