principlesqlMinor
What is most efficient way to compare two large result sets in SQL Server 2012
Viewed 0 times
result2012whatsqlefficientwaysetstwolargeserver
Problem
Current advice for the most efficient way to compare two large result/row sets seems to be to use the
EXCEPT operator. This self contained SQL script below gets very inefficient as row sizes increase (change @last values). I have tried to find unique entries in a combined table but with no improvement.DECLARE @first AS INT, @step AS INT, @last AS INT;
-- This script is comparing two record sets using EXCEPT
-- I want to find additions from OLD to NEW
-- As number of rows increase performance gets terrible
-- I don't have to use two tables. I could use one combined table but I want the same result as quickly as possible
-- Compare 100 to 110 rows - 0 seconds
-- Compare 1000 to 1010 rows - 1 seconds
-- Compare 10000 to 10010 rows - 16 seconds
-- Compare 100000 to 100010 rows - ABORT after 8 minutes (tables are populated in 18 seconds)
DECLARE @temptableOLD TABLE ([Result1] int);
SET @step = 1; SET @first = 1; SET @last = 100000
WHILE(@first <= @last) BEGIN INSERT INTO @temptableOLD VALUES(@first) SET @first += @step END
DECLARE @temptableNEW TABLE ([Result1] int);
SET @step = 1; SET @first = 1; SET @last = 100010
WHILE(@first <= @last) BEGIN INSERT INTO @temptableNEW VALUES(@first) SET @first += @step END
select * from @temptableNEW
except
select * from @temptableOLDSolution
EXCEPT implies a DISTINCT operation.I would use
NOT EXISTS if this is not actually required.However the problem you are having is likely that you are getting a nested loops on an unindexed table due to the poor cardinality estimates associated with table variables.
select * from @temptableNEW
except
select * from @temptableOLD
OPTION (RECOMPILE)Will be able to take account that the tables have 100K rows each and give a different plan.
In SQL Server 2012 you can only add indexes to table variables via constraints. If the values are unique you could use
DECLARE @temptableOLD TABLE ([Result1] int UNIQUE CLUSTERED);to add an index. If done on both tables the plan (after the recompile hint is added) will likely use a merge join instead. Without any indexes I would expect a hash join.
Code Snippets
select * from @temptableNEW
except
select * from @temptableOLD
OPTION (RECOMPILE)DECLARE @temptableOLD TABLE ([Result1] int UNIQUE CLUSTERED);Context
StackExchange Database Administrators Q#124630, answer score: 8
Revisions (0)
No revisions yet.