patternsqlMinor
SQL Server Scalar UDF Parallelism Mystery!
Viewed 0 times
mysterysqlscalarserverparallelismudf
Problem
Referencing this blog post by Erik Darling on the website of my favorite SQL Server guru - Brent Ozar:
When you select from that table alone, it shows "CouldNotGenerateValidParallelPlan".
But, when you join that table to another that does not have a check constraint/computed column calling a Scalar UDF, the query goes parallel with "Good Enough Plan Found"
When you select from that table alone, it shows "CouldNotGenerateValidParallelPlan".
But, when you join that table to another that does not have a check constraint/computed column calling a Scalar UDF, the query goes parallel with "Good Enough Plan Found"
USE tempdb;
SET NOCOUNT ON;
SELECT TOP 10000
ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL )) AS ID, DATEADD(MINUTE, m.message_id, SYSDATETIME()) AS SomeDate
INTO dbo.constraint_test_1
FROM sys.messages AS m, sys.messages AS m2;
GO
SELECT TOP 10000
ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL )) AS ID, DATEADD(MINUTE, m.message_id, SYSDATETIME()) AS SomeDate
INTO dbo.constraint_test_2
FROM sys.messages AS m, sys.messages AS m2;
GO
CREATE FUNCTION dbo.DateCheck ( @d DATETIME2(7))
RETURNS BIT
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @Out BIT;
SELECT @Out = CASE WHEN @d < DATEADD(DAY, 30, SYSDATETIME()) THEN 1 ELSE 0 END;
RETURN @Out;
END;
GO
ALTER TABLE dbo.constraint_test_1 ADD CONSTRAINT ck_cc_dt CHECK ( dbo.DateCheck(SomeDate) = 1 );
SELECT *
FROM dbo.constraint_test_1
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- Does not go parallel
SELECT T1.ID, T2.SomeDate
FROM dbo.constraint_test_1 T1 INNER JOIN
dbo.constraint_test_2 T2 ON T1.ID = T2.ID
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- Goes parallelSolution
You need a non-trivial plan if you want parallelism, plus the optimizer must be able to find a valid parallel plan for the query.
For example, modifying your first query (which did not use parallelism):
I changed the
Your second query (which did go parallel) can be modified to be always serial due to the constraint:
I added T1.SomeDate to the projection list, so the problematic column is now used by the query. When the column is not needed, the constraint on it is disregarded by the optimizer, so it can find a parallel plan.
The NonParallelPlanReason show plan attribute may be added early in compilation when the process detects a condition that would prevent parallelism. For example, the following query produces the reason ParallelismDisabledByTraceFlag:
The presence of a non-parallel reason does not mean parallelism would definitely have been considered otherwise. The optimizer only considers parallelism after the trivial plan stage, optionally search 0, and definitely search 1. For more information on optimizer stages please see my Query Optimizer Deep Dive series.
You mentioned in a comment that you are interested in the behaviour of computed columns referencing scalar functions with parallelism. I describe this in Properly Persisted Computed Columns.
Finally, not a particular concern here, but if you want to test things like this you're better off doing it in a real user database not tempdb. There are several things that work differently in tempdb.
For example, modifying your first query (which did not use parallelism):
SELECT COUNT_BIG(*)
FROM dbo.constraint_test_1
WHERE ID > (SELECT 0)
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- Goes parallelI changed the
SELECT to SELECT COUNT_BIG() there to avoid projecting the column with the scalar UDF constraint. I also added a WHERE clause complex enough to skip the trivial plan stage.Your second query (which did go parallel) can be modified to be always serial due to the constraint:
SELECT T1.ID, T2.SomeDate, T1.SomeDate
FROM dbo.constraint_test_1 T1 INNER JOIN
dbo.constraint_test_2 T2 ON T1.ID = T2.ID
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- No parallelI added T1.SomeDate to the projection list, so the problematic column is now used by the query. When the column is not needed, the constraint on it is disregarded by the optimizer, so it can find a parallel plan.
The NonParallelPlanReason show plan attribute may be added early in compilation when the process detects a condition that would prevent parallelism. For example, the following query produces the reason ParallelismDisabledByTraceFlag:
SELECT CT.ID
FROM dbo.constraint_test_1 AS CT
OPTION (QUERYTRACEON 8687);The presence of a non-parallel reason does not mean parallelism would definitely have been considered otherwise. The optimizer only considers parallelism after the trivial plan stage, optionally search 0, and definitely search 1. For more information on optimizer stages please see my Query Optimizer Deep Dive series.
You mentioned in a comment that you are interested in the behaviour of computed columns referencing scalar functions with parallelism. I describe this in Properly Persisted Computed Columns.
Finally, not a particular concern here, but if you want to test things like this you're better off doing it in a real user database not tempdb. There are several things that work differently in tempdb.
Code Snippets
SELECT COUNT_BIG(*)
FROM dbo.constraint_test_1
WHERE ID > (SELECT 0)
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- Goes parallelSELECT T1.ID, T2.SomeDate, T1.SomeDate
FROM dbo.constraint_test_1 T1 INNER JOIN
dbo.constraint_test_2 T2 ON T1.ID = T2.ID
OPTION (QUERYTRACEON 8649, MAXDOP 0, RECOMPILE); -- No parallelSELECT CT.ID
FROM dbo.constraint_test_1 AS CT
OPTION (QUERYTRACEON 8687);Context
StackExchange Database Administrators Q#306925, answer score: 8
Revisions (0)
No revisions yet.