patternsqlMinor
Inline Scalar UDF Under RCSI - Can Results Differ
Viewed 0 times
canscalarunderrcsidifferinlineresultsudf
Problem
SQL Server 2019 introduces Scalar UDF inlining, aka "Froid." This ".. embeds [scalar UDFs] in the calling SQL query."
Previously, scalar UDFs ran under their own execution context, separate from the surrounding query's. One consequence of this is that under read committed snapshot isolation (RCSI) the function could see a different set of values than the containing query sees (link).
Is it possible that a query containing a scalar function, when running in RCSI with concurrent writes, could produce different results depending on whether the function is in-lined or not?
Previously, scalar UDFs ran under their own execution context, separate from the surrounding query's. One consequence of this is that under read committed snapshot isolation (RCSI) the function could see a different set of values than the containing query sees (link).
Is it possible that a query containing a scalar function, when running in RCSI with concurrent writes, could produce different results depending on whether the function is in-lined or not?
Solution
Yes, an in-lined function can show different results than its out-lined (!?) counterpart. The following reliably reproduces the situation on my machine (Windows 10, 4 cores + HT @ 2GHz, 16GB RAM, SSD).
Configure the database and session to use Read Committed Snapshot Isolation (RCSI):
This table will give a shared object on which concurrent workloads can act.
This table is to capture the results from the test, hopefully revealing divergent behaviour between functions that are in-lined and not:
To demonstrate the different behaviour I want two function executed within a single SELECT, one of which in in-line and the other isn't. The documentation says
A scalar T-SQL UDF can be inline if .. the UDF does not invoke any intrinsic function .. such as GETDATE()
To ensure one UDF cannot be in-line I add a reference to GETDATE. Note that this additional statement plays no part in the logic of the UDF, it simply suppresses in-lining. (Indeed, this function could be optimised away. Perhaps some future release will implement just such a optimization?)
To reference the shared table I've arbitrarily chosen to use SUM. I believe, but haven't tested, that any other technique that surfaces differences in the rows seen by the functions and the containing SELECT (MIN, MAX, TOP(1)) would do just as well.
Next I start two sessions. The first is to perform the SELECT, the second to make concurrent writes against the shared table.
I set running the session performing writes. On my machine it runs for about 24s, which is ample time to switch to session 1 (the reads) and start it.
For one run over 99 SELECTs there are 12 instances where the in-line and traditional execution mechanism return different results. In every case the in-lined function returns the same result as the containing query (which is not to say this test shows such behaviour is guaranteed).
Configure the database and session to use Read Committed Snapshot Isolation (RCSI):
alter database Sandpit
set read_committed_snapshot on
with rollback immediate;
GO
set transaction isolation level read committed;
GOThis table will give a shared object on which concurrent workloads can act.
drop table if exists t;
go
create table t(c int);
insert t(c) values (1);
goThis table is to capture the results from the test, hopefully revealing divergent behaviour between functions that are in-lined and not:
drop table if exists #Out;
go
create table #Out(Base int, Old int, New int);
goTo demonstrate the different behaviour I want two function executed within a single SELECT, one of which in in-line and the other isn't. The documentation says
A scalar T-SQL UDF can be inline if .. the UDF does not invoke any intrinsic function .. such as GETDATE()
To ensure one UDF cannot be in-line I add a reference to GETDATE. Note that this additional statement plays no part in the logic of the UDF, it simply suppresses in-lining. (Indeed, this function could be optimised away. Perhaps some future release will implement just such a optimization?)
create or alter function dbo.Old_skool()
returns int
as
begin
declare @tot int = 0;
declare @d date = GETDATE(); -- inhibits in-lining
select @tot = SUM(C) from t;
return @tot;
end
go
create or alter function dbo.New_kid_on_the_block()
returns int
as
begin
declare @tot int = 0;
select @tot = SUM(C) from t;
return @tot;
end
goTo reference the shared table I've arbitrarily chosen to use SUM. I believe, but haven't tested, that any other technique that surfaces differences in the rows seen by the functions and the containing SELECT (MIN, MAX, TOP(1)) would do just as well.
Next I start two sessions. The first is to perform the SELECT, the second to make concurrent writes against the shared table.
-- Session 1 for reads
set transaction isolation level read committed;
GO
truncate table #Out;
declare @c int = 0;
while @c < 99 -- large enough to exhibit the behaviour
begin
insert #Out(Base, Old, New)
select
c,
dbo.Old_skool(),
dbo.New_kid_on_the_block()
from t;
set @c += 1;
end
-- Session 2 for writes
declare @c int = 0;
while @c < 99999
begin
update t set c = c + 1;
set @c += 1;
endI set running the session performing writes. On my machine it runs for about 24s, which is ample time to switch to session 1 (the reads) and start it.
For one run over 99 SELECTs there are 12 instances where the in-line and traditional execution mechanism return different results. In every case the in-lined function returns the same result as the containing query (which is not to say this test shows such behaviour is guaranteed).
Base Old New
----------- ----------- -----------
1801 1802 1801
1803 1804 1803
1814 1815 1814
1841 1842 1841
1856 1857 1856
1857 1858 1857
1860 1861 1860
1861 1862 1861
1864 1865 1864
1883 1884 1883
1884 1885 1884
1890 1891 1890Code Snippets
alter database Sandpit
set read_committed_snapshot on
with rollback immediate;
GO
set transaction isolation level read committed;
GOdrop table if exists t;
go
create table t(c int);
insert t(c) values (1);
godrop table if exists #Out;
go
create table #Out(Base int, Old int, New int);
gocreate or alter function dbo.Old_skool()
returns int
as
begin
declare @tot int = 0;
declare @d date = GETDATE(); -- inhibits in-lining
select @tot = SUM(C) from t;
return @tot;
end
go
create or alter function dbo.New_kid_on_the_block()
returns int
as
begin
declare @tot int = 0;
select @tot = SUM(C) from t;
return @tot;
end
go-- Session 1 for reads
set transaction isolation level read committed;
GO
truncate table #Out;
declare @c int = 0;
while @c < 99 -- large enough to exhibit the behaviour
begin
insert #Out(Base, Old, New)
select
c,
dbo.Old_skool(),
dbo.New_kid_on_the_block()
from t;
set @c += 1;
end
-- Session 2 for writes
declare @c int = 0;
while @c < 99999
begin
update t set c = c + 1;
set @c += 1;
endContext
StackExchange Database Administrators Q#252623, answer score: 6
Revisions (0)
No revisions yet.