patternsqlMinor
Interchanging IN with EXISTS produces different result sets
Viewed 0 times
resultwithexistsinterchangingdifferentproducessets
Problem
I am attempting to update a query that utilizes the
I'm noticing that when the query is ran with the
The use case is relatively straightforward: the query accepts a pipe delimited
To illustrate, here is a similar table definition:
Populate the test table with dummy data.
```
SET IDENTITY_INSERT [dbo].[Document] ON;
;WITH [DocumentSeed] AS (
SELECT
1 AS [DocumentId]
UNION ALL
SELECT
[DocumentId] + 1
FROM
[DocumentSeed]
WHERE
[DocumentId] < 2048)
INSERT INTO [dbo].[Document] ([DocumentId], [DocumentSeriesId])
SELECT
[DocumentId]
,ABS(CHECKSUM(NEWID()) % 4) + 1
FROM
[DocumentSeed] OPTION (MAXRECURSION 2048);
S
IN operator within a WHERE clause predicate with EXISTS to compare potential performance improvements and better understand what is happening behind the scenes when the two are interchanged. It is my understanding that in practice, the query optimizer treats EXISTS and IN the same way whenever it can.I'm noticing that when the query is ran with the
IN operator, it returns the desired result set. However, when I replace it with the EXISTS equivalent, it pulls in all values from the primary table I want to filter. It is ignoring the provided input values passed to EXISTS (SELECT ... and returning all possible distinct values.The use case is relatively straightforward: the query accepts a pipe delimited
@Series string that can contain up to 4 values, e.g. S1|S2|S3|S4 or S2|S4. From here I string_split the input into a table variable @SeriesSplit to determine the corresponding internal [SeriesId] for the [Series]. The result set returned is then filtered to exclude the [Series] that were not passed.To illustrate, here is a similar table definition:
DROP TABLE IF EXISTS [dbo].[Document]
IF OBJECT_ID('[dbo].[Document]', 'U') IS NULL
BEGIN
CREATE TABLE [dbo].[Document] (
[DocumentId] bigint IDENTITY(1,1) NOT NULL
,[DocumentSeriesId] [tinyint] NOT NULL
,CONSTRAINT [PK_Document] PRIMARY KEY CLUSTERED ([DocumentId] ASC)
,INDEX [IX_Document_SeriesId] NONCLUSTERED ([DocumentSeriesId] ASC)
);
END;
GOPopulate the test table with dummy data.
```
SET IDENTITY_INSERT [dbo].[Document] ON;
;WITH [DocumentSeed] AS (
SELECT
1 AS [DocumentId]
UNION ALL
SELECT
[DocumentId] + 1
FROM
[DocumentSeed]
WHERE
[DocumentId] < 2048)
INSERT INTO [dbo].[Document] ([DocumentId], [DocumentSeriesId])
SELECT
[DocumentId]
,ABS(CHECKSUM(NEWID()) % 4) + 1
FROM
[DocumentSeed] OPTION (MAXRECURSION 2048);
S
Solution
You didn't rewrite the query properly. When converting an
Your query basically says
Give me all data from
It should be:
IN subqquery to an EXISTS one, your EXISTS subquery should be correlated. But you defined the Document table again in the subquery, making it uncorrelated. Your query basically says
Give me all data from
Documents if @SeriesSplit has at least one row - and nothing if @SeriesSplit is empty.It should be:
SELECT DISTINCT
D1.[DocumentSeriesId]
FROM
[dbo].[Document] D1
WHERE
EXISTS (SELECT 1 FROM @SeriesSplit SS
WHERE SS.[SeriesId] = D1.[DocumentSeriesId]) ;Code Snippets
SELECT DISTINCT
D1.[DocumentSeriesId]
FROM
[dbo].[Document] D1
WHERE
EXISTS (SELECT 1 FROM @SeriesSplit SS
WHERE SS.[SeriesId] = D1.[DocumentSeriesId]) ;Context
StackExchange Database Administrators Q#203070, answer score: 5
Revisions (0)
No revisions yet.