Function with a CASE operation hangs when NULL is passed
Problem
I created a function that accepts a start and end date, with the end date being optional. I then wrote a CASE in the filter to use the start date if no end date is passed:

CASE WHEN @dateEnd IS NULL
     THEN @dateStart
     ELSE @dateEnd
END

When I call the function for the most recent month of the data:

SELECT * FROM theFunction ('2013-06-01', NULL)

... the query hangs. If I specify the end date:

SELECT * FROM theFunction ('2013-06-01', '2013-06-01')

... the result is returned normally. I took the code out of the function and it ran fine inside a query window. I can't duplicate the issue in the fiddle either. A query like:

SELECT * FROM theFunction ('2013-04-01', '2013-06-01')

... also works fine.

Is there anything in the query (below) that could be causing the function to hang when a NULL is passed for the end date?

SQL Fiddle
- Execution plan for SELECT * FROM theFunction ('2013-06-01', '2013-06-01')
- Estimated plan for SELECT * FROM theFunction ('2013-06-01', NULL)
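For reference, here is a minimal sketch of the shape of function being described, reconstructed from the fragments quoted in this question and in the answer below; the column list, parameter types, and exact body are assumptions:

CREATE FUNCTION dbo.theFunction (@dateStart date, @dateEnd date)
RETURNS TABLE
AS
RETURN
    -- Column list is a guess; the question only shows SELECT * calls.
    SELECT a.d, b.ID, c.TC_DATE
    FROM [dbo].[calendar] a
    LEFT JOIN [dbo].[colleagueList] b
           ON b.[Date] = a.d
    -- The colleagueTime join quoted in the answer, aliased here for brevity.
    LEFT JOIN [dbo].[colleagueTime] c
           ON c.TC_DATE = b.[Date]
          AND c.ASSOC_ID = CAST(b.ID AS VARCHAR(10))
    WHERE DAY(a.[d]) = 1
      AND a.[d] BETWEEN @dateStart AND CASE WHEN @dateEnd IS NULL
                                            THEN @dateStart
                                            ELSE @dateEnd
                                       END;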
Solution
Part of your initial query is as follows.
FROM [dbo].[calendar] a
LEFT JOIN [dbo].[colleagueList] b
       ON b.[Date] = a.d
WHERE DAY(a.[d]) = 1
  AND a.[d] BETWEEN @dateStart AND COALESCE(@dateEnd,@dateStart)

That section of the plan is shown below.
Your revised query with

a.[d] BETWEEN @dateStart AND ISNULL(@dateEnd,@dateStart)

has this for the same join.

The difference seems to be that ISNULL simplifies further, and as a result you get more accurate cardinality estimates going into the next join. This is an inline table valued function and you are calling it with literal values, so it can do something like:

a.[d] BETWEEN @dateStart AND ISNULL(@dateEnd,@dateStart)
a.[d] BETWEEN '2013-06-01' AND ISNULL(NULL,'2013-06-01')
a.[d] BETWEEN '2013-06-01' AND '2013-06-01'
a.[d] = '2013-06-01'

And as there is an equi join predicate b.[Date] = a.d, the plan also shows an equality predicate b.[Date] = '2013-06-01'. As a result the cardinality estimate of 28,393 rows is likely to be pretty accurate.
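To check what the predicate folded to in your own environment, you can capture the estimated plan without executing the query, so the hanging NULL case is safe to inspect. SET SHOWPLAN_XML is standard T-SQL; exactly where the simplified predicate surfaces in the plan XML may vary by version:

SET SHOWPLAN_XML ON;
GO
-- These batches are compiled but not executed; the seek/scan predicates
-- in the returned plan XML show how far each expression was simplified.
SELECT * FROM theFunction ('2013-06-01', NULL);
SELECT * FROM theFunction ('2013-06-01', '2013-06-01');
GO
SET SHOWPLAN_XML OFF;
GO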
For the CASE/COALESCE version, when @dateStart and @dateEnd are the same value it simplifies to the same equality expression and gives the same plan, but when @dateStart = '2013-06-01' and @dateEnd IS NULL it only gets as far as

a.[d] >= '2013-06-01' AND a.[d] <= CASE WHEN (1) THEN '2013-06-01' ELSE NULL END

which it also applies as an implied predicate on colleagueList. The estimated number of rows this time is 79.8.
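(Worth spelling out, since the question wrote the filter with CASE while the quoted query uses COALESCE: the standard defines COALESCE as shorthand for exactly this kind of CASE expression, so the optimizer sees the two forms the same way.)

-- The question's filter:
CASE WHEN @dateEnd IS NULL THEN @dateStart ELSE @dateEnd END
-- The answer's quoted COALESCE, which by definition expands to an
-- equivalent CASE with the branches swapped:
COALESCE(@dateEnd, @dateStart)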
The next join along is

LEFT JOIN colleagueTime
       ON colleagueTime.TC_DATE = colleagueList.Date
      AND colleagueTime.ASSOC_ID = CAST(colleagueList.ID AS VARCHAR(10))

colleagueTime is a 3,249,590-row table which is (again) apparently a heap with no useful indexes.

This discrepancy in estimates affects the join choice used. The ISNULL plan chooses a hash join that just scans the table once. The COALESCE plan chooses a nested loops join and estimates that it will still only need to scan the table once, and that it can spool the result and replay it 78 times; i.e. it estimates that the correlated parameters will not change.

From the fact that the nested loops plan was still going after two hours, this assumption of a single scan against colleagueTime seems likely to be highly inaccurate.
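As an aside (not from the original answer): the heap claim can be confirmed from the catalog views; a row in sys.indexes with index_id = 0 and type_desc = 'HEAP' means the table has no clustered index, and any other rows list whatever indexes do exist:

SELECT i.index_id, i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.colleagueTime');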
As for why the estimated number of rows between the two joins is so much lower, I'm not sure without being able to see the statistics on the tables. The only way I managed to skew the estimated row counts that much in my testing was by adding a load of NULL rows (this reduced the estimated row count even though the actual number of rows returned remained the same).

The estimated row count in the COALESCE plan with my test data was on the order of

(number of rows matching the >= condition) * 30% * (proportion of rows in the table that are not NULL)

or in SQL:

SELECT 1E0 * COUNT([Date]) / COUNT(*) * ( COUNT(CASE
                                                  WHEN [Date] >= '2013-06-01' THEN 1
                                                END) * 0.30 )
FROM [dbo].[colleagueList]

but this does not square with your comment that the column has no NULL values.
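The practical fix that follows from all this, and which your revised query already demonstrates, is to use ISNULL rather than CASE/COALESCE in the function's filter. A sketch, reusing the assumed body from the question above:

ALTER FUNCTION dbo.theFunction (@dateStart date, @dateEnd date)
RETURNS TABLE
AS
RETURN
    SELECT a.d, b.ID, c.TC_DATE   -- assumed column list, as before
    FROM [dbo].[calendar] a
    LEFT JOIN [dbo].[colleagueList] b
           ON b.[Date] = a.d
    LEFT JOIN [dbo].[colleagueTime] c
           ON c.TC_DATE = b.[Date]
          AND c.ASSOC_ID = CAST(b.ID AS VARCHAR(10))
    WHERE DAY(a.[d]) = 1
      -- When called with a literal NULL end date, this folds to an equality
      -- on the start date, giving the accurate estimate and the hash join plan.
      AND a.[d] BETWEEN @dateStart AND ISNULL(@dateEnd, @dateStart);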
Context
StackExchange Database Administrators Q#46486, answer score: 7