HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Minimise duration of table lock

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
minimisetabledurationlock

Problem

I've got a SQL table that needs to be updated daily.
There may or may not be queries against that table while the update is happening. It's around 500,000 rows.

We have an issue where there is locking conflict when the job to update the table is running at the same time as a query against it.

So I have rewritten the process to update the table as follows:

```
ALTER procedure [dbo].[Table_Generate] as

declare @d datetime = getdate(), @c as int

--Check temp tables
IF OBJECT_ID('tempdb..#final') IS NOT NULL
DROP TABLE #final

IF OBJECT_ID('tempdb..#base') IS NOT NULL
DROP TABLE #base

--Get source data from linked server
select
ID,
Reference,
StartDate,
EndDate,
Description,
SomeCode
into #base

from [LinkedServer].[Database].dbo.[A_View]

--Generate row_hash
select
ID,
Reference,
StartDate,
EndDate,
Description,
SomeCode,
hashbytes('SHA2_256',(
select
ID,
Reference,
StartDate,
EndDate,
Description,
SomeCode
from #base sub where sub.ID = main.ID for xml raw)) as row_hash
into #final
from #base main

select @c = count(*) from #final
if @c >0 begin

merge [The_Table_Staging] as target
using #final as source
on source.ID = target.ID

--New rows
when not matched by target then
insert ( RunDate,
ID,
Reference,
StartDate,
EndDate,
Description,
SomeCode,
Row_Hash
) values (
@d,
source.ID,
source.Reference,
source.StartDate,
source.EndDate,
source.Description,
source.SomeCode,
source.row_hash)

--Existing changed rows
when matched and source.row_hash != target.row_hash then update set
target.RunDate = @d
,target.Reference = source.Reference
,target.StartDate = source.StartDate
,target.EndDate

Solution

Unfortunately I see a lot of sins in your code at a quick glance:

-
There's a multitude of known bugs with the MERGE statement that it really should be avoided altogether in production code.

-
MERGE is also known to be less performant than writing the individual INSERT, UPDATE, and DELETE statements; which may be the reasoning for some of your blocking issues.

-
While tempting, using the sp_rename function to minimize blocking issues can actually result in worse blocking issues, as discussed in Kendra Little's Why You Should Switch in Staging Tables Instead of Renaming Them. (If I recall correctly, this discusses using partition switching as a better solution.)

-
Linked Servers are also known to be performance bottlenecks at times too (for fixed cardinality estimates and bringing all the data across the network before processing it). If you were previously relying on it in the transaction of your code, it's possible you can just keep that part outside of the transaction to minimize the lock time of your local tables.

-
Also, I do like the use of the HASHBYTES() function to generate a row hash, that's something I've used in the past. But you may also find it more performant to call it on the row itself by using CONCAT() across every column with a safe column separator (e.g. double pipe ||) as the parameter to it as opposed to using it around a subquery that leverages XML.

-
That being said, as noted by Erik Darling, HASHBYTES() itself is sometimes susceptible to performance issues at scale. If it is your bottleneck here, you can try materializing the results of the function in a computed column or indexed view (it is deterministic), or use an alternative method, such as CLR, as mentioned in the linked post.

Context

StackExchange Database Administrators Q#312246, answer score: 8

Revisions (0)

No revisions yet.