snippetsqlModerate
How can I force a scalar UDF to be evaluated just once in a query?
Viewed 0 times
oncecanforceevaluatedjustscalarqueryhowudf
Problem
I have a query which needs to filter against the result of a scalar UDF. The query must be sent as a single statement (so I can't assign the UDF result to a local variable) and I cannot use a TVF. I am aware of the performance issues caused by scalar UDFs, which include forcing the entire plan to be run serially, excessive memory grants, cardinality estimation problems, and lack of inlining. For this question please assume that I need to use a scalar UDF.
The UDF itself is pretty expensive to call but in theory the queries can be logically implemented by the optimizer in such a way that the function only needs to be calculated once. I have mocked up a greatly simplified example for this question. The following query takes 6152 ms to execute on my machine:
The filter operator in the query plan suggests that the function was evaluated once for each row:
DDL and data prep:
Here is a db fiddle link for the above example, although the code takes about 18 seconds t
The UDF itself is pretty expensive to call but in theory the queries can be logically implemented by the optimizer in such a way that the function only needs to be calculated once. I have mocked up a greatly simplified example for this question. The following query takes 6152 ms to execute on my machine:
SELECT x1.ID
FROM dbo.X_100_INTEGERS x1
WHERE x1.ID >= dbo.EXPENSIVE_UDF();The filter operator in the query plan suggests that the function was evaluated once for each row:
DDL and data prep:
CREATE OR ALTER FUNCTION dbo.EXPENSIVE_UDF () RETURNS INT
AS
BEGIN
DECLARE @tbl TABLE (VAL VARCHAR(5));
-- make the function expensive to call
INSERT INTO @tbl
SELECT [VALUE]
FROM STRING_SPLIT(REPLICATE(CAST('Z ' AS VARCHAR(MAX)), 20000), ' ');
RETURN 1;
END;
GO
DROP TABLE IF EXISTS dbo.X_100_INTEGERS;
CREATE TABLE dbo.X_100_INTEGERS (ID INT NOT NULL);
-- insert 100 integers from 1 - 100
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT INTO dbo.X_100_INTEGERS WITH (TABLOCK)
SELECT n FROM Nums WHERE n <= 100;Here is a db fiddle link for the above example, although the code takes about 18 seconds t
Solution
Ultimately, it is not possible to force SQL Server to evaluate a scalar UDF just once in a query. However, there are some steps which can be taken to encourage it. With testing I believe that you can get something that works with the current version of SQL Server, but it's possible that future changes will require you to revisit your code.
If it's possible to edit the code a good first thing to try is to make the function deterministic if possible. Paul White points out here that the function must be created with the
After making the following change:
The query from the question is executed in 64 ms:
The query plan no longer has the filter operator:
To be sure that it executed only once we can use the new sys.dm_exec_function_stats DMV released in SQL Server 2016:
Issuing an
Note that just because the function is deterministic does not mean that it will be evaluated only once for any query. In fact, for some queries adding
The superfluous
Based on that, one would expect the UDF to be evaluated once and to be used as the outer table in the nested loop join. However, the query takes 6446 ms to run on my machine. According to
More often than not, a Compute Scalar simply defines an expression; the actual computation is deferred until something later in the execution plan needs the result.
For this query it looks like the UDF call was deferred until it was needed, at which point it was evaluated 100 times.
Interestingly, the CTE example executes in 71 ms on my machine when the UDF is not defined with
It's not clear why the Compute Scalar isn't deferred. It could be because the nondeterminism of the function limits the rearranging of operators that the query optimizer can do.
An alternative approach is to add a small table to the CTE and to query the only row in that table. Any small table will do, but let's use the following:
The query then becomes:
The addition of the
The query executes in about 110 ms on my machine and the UDF is evaluated only once according to
In summary, for deterministic functions (which must include the
For functions not considered by SQL Server to b
If it's possible to edit the code a good first thing to try is to make the function deterministic if possible. Paul White points out here that the function must be created with the
SCHEMABINDING option and the function code itself must be deterministic.After making the following change:
CREATE OR ALTER FUNCTION dbo.EXPENSIVE_UDF () RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
DECLARE @tbl TABLE (VAL VARCHAR(5));
-- make the function expensive to call
INSERT INTO @tbl
SELECT [VALUE]
FROM STRING_SPLIT(REPLICATE(CAST('Z ' AS VARCHAR(MAX)), 20000), ' ');
RETURN 1;
END;The query from the question is executed in 64 ms:
SELECT x1.ID
FROM dbo.X_100_INTEGERS x1
WHERE x1.ID >= dbo.EXPENSIVE_UDF();The query plan no longer has the filter operator:
To be sure that it executed only once we can use the new sys.dm_exec_function_stats DMV released in SQL Server 2016:
SELECT execution_count
FROM sys.dm_exec_function_stats
WHERE object_id = OBJECT_ID('EXPENSIVE_UDF', 'FN');Issuing an
ALTER against the function will reset the execution_count for that object. The above query returns 1 which means the function was only executed once.Note that just because the function is deterministic does not mean that it will be evaluated only once for any query. In fact, for some queries adding
SCHEMABINDING can degrade performance. Consider the following query:WITH cte (UDF_VALUE) AS
(
SELECT DISTINCT dbo.EXPENSIVE_UDF() UDF_VALUE
)
SELECT ID
FROM dbo.X_100_INTEGERS
INNER JOIN cte ON ID >= cte.UDF_VALUE;The superfluous
DISTINCT was added to get rid of a Filter operator. The plan looks promising:Based on that, one would expect the UDF to be evaluated once and to be used as the outer table in the nested loop join. However, the query takes 6446 ms to run on my machine. According to
sys.dm_exec_function_stats the function was executed 100 times. How that is that possible? In "Compute Scalars, Expressions and Execution Plan Performance", Paul White points out that the Compute Scalar operator can be deferred: More often than not, a Compute Scalar simply defines an expression; the actual computation is deferred until something later in the execution plan needs the result.
For this query it looks like the UDF call was deferred until it was needed, at which point it was evaluated 100 times.
Interestingly, the CTE example executes in 71 ms on my machine when the UDF is not defined with
SCHEMABINDING, as in the original question. The function is only executed once when the query is run. Here is the query plan for that:It's not clear why the Compute Scalar isn't deferred. It could be because the nondeterminism of the function limits the rearranging of operators that the query optimizer can do.
An alternative approach is to add a small table to the CTE and to query the only row in that table. Any small table will do, but let's use the following:
CREATE TABLE dbo.X_ONE_ROW_TABLE (ID INT NOT NULL);
INSERT INTO dbo.X_ONE_ROW_TABLE VALUES (1);The query then becomes:
WITH cte (UDF_VALUE) AS
(
SELECT DISTINCT dbo.EXPENSIVE_UDF() UDF_VALUE
FROM dbo.X_ONE_ROW_TABLE
)
SELECT ID
FROM dbo.X_100_INTEGERS
INNER JOIN cte ON ID >= cte.UDF_VALUE;The addition of the
dbo.X_ONE_ROW_TABLE adds uncertainty for the optimizer. If the table has zero rows then the CTE will return 0 rows. In any case, the optimizer cannot guarantee that the CTE will return one row if the UDF is not deterministic, so it seems likely that the UDF will be evaluated before the join. I would expect the optimizer to scan dbo.X_ONE_ROW_TABLE, use a stream aggregate to get the maximum value of the one row returned (which requires the function to be evaluated), and to use that as the outer table for a nested loop join to dbo.X_100_INTEGERS in the main query. This appears to be what happens:The query executes in about 110 ms on my machine and the UDF is evaluated only once according to
sys.dm_exec_function_stats. It would be incorrect to say that the query optimizer is forced to evaluate the UDF only once. However, it is hard to imagine an optimizer rewrite that would lead to a lower cost query, even with the limitations around UDF and compute scalar costing.In summary, for deterministic functions (which must include the
SCHEMABINDING option) try writing the query in as simple of a way as possible. If on SQL Server 2016 or a later version, confirm that the function was only executed once using sys.dm_exec_function_stats. Execution plans can be misleading in that regard.For functions not considered by SQL Server to b
Code Snippets
CREATE OR ALTER FUNCTION dbo.EXPENSIVE_UDF () RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
DECLARE @tbl TABLE (VAL VARCHAR(5));
-- make the function expensive to call
INSERT INTO @tbl
SELECT [VALUE]
FROM STRING_SPLIT(REPLICATE(CAST('Z ' AS VARCHAR(MAX)), 20000), ' ');
RETURN 1;
END;SELECT x1.ID
FROM dbo.X_100_INTEGERS x1
WHERE x1.ID >= dbo.EXPENSIVE_UDF();SELECT execution_count
FROM sys.dm_exec_function_stats
WHERE object_id = OBJECT_ID('EXPENSIVE_UDF', 'FN');WITH cte (UDF_VALUE) AS
(
SELECT DISTINCT dbo.EXPENSIVE_UDF() UDF_VALUE
)
SELECT ID
FROM dbo.X_100_INTEGERS
INNER JOIN cte ON ID >= cte.UDF_VALUE;CREATE TABLE dbo.X_ONE_ROW_TABLE (ID INT NOT NULL);
INSERT INTO dbo.X_ONE_ROW_TABLE VALUES (1);Context
StackExchange Database Administrators Q#166342, answer score: 18
Revisions (0)
No revisions yet.