patternsqlMajor
Hash keys probe and residual
Viewed 0 times
residualhashkeysprobeand
Problem
Say, we have a query like this:
Assuming the above query uses a Hash Join and has a residual, the probe key will be
But while going through another example, I could see both the probe and the residual to be the same column. Below is an elaboration on what I am trying to say:
Query:
Execution plan, with probe and residual highlighted:
Test data:
Question:
How can a probe and a residual be the same column? Why can't SQL Server use the probe column only? Why does it have to use the same column as a residual to filter rows again?
References for test data:
select a.*,b.*
from
a join b
on a.col1=b.col1
and len(a.col1)=10Assuming the above query uses a Hash Join and has a residual, the probe key will be
col1 and the residual will be len(a.col1)=10.But while going through another example, I could see both the probe and the residual to be the same column. Below is an elaboration on what I am trying to say:
Query:
select *
from T1 join T2 on T1.a = T2.aExecution plan, with probe and residual highlighted:
Test data:
create table T1 (a int, b int, x char(200))
create table T2 (a int, b int, x char(200))
set nocount on
declare @i int
set @i = 0
while @i < 1000
begin
insert T1 values (@i * 2, @i * 5, @i)
set @i = @i + 1
end
declare @i int
set @i = 0
while @i < 10000
begin
insert T2 values (@i * 3, @i * 7, @i)
set @i = @i + 1
endQuestion:
How can a probe and a residual be the same column? Why can't SQL Server use the probe column only? Why does it have to use the same column as a residual to filter rows again?
References for test data:
- Hash Join (Craig Freedman's SQL Server Blog)
Solution
If the join is on a single column typed as
Otherwise, you will see a residual as items in the hash bucket are tested for a match, not just a hash function match.
Your test does not specify
More information in my post Join Performance, Implicit Conversions, and Residuals and Hash Join Execution Internals by Dmitry Pilugin.
* Other qualifying types are bit, smalldatetime, smallmoney, and (var)char(n) for n = 1 and binary collation
tinyint, smallint, or integer* and if both columns are constrained to be NOT NULL, the hash function is 'perfect' – meaning there is no chance of a hash collision, and the query processor does not have to check the values again to ensure they really match.Otherwise, you will see a residual as items in the hash bucket are tested for a match, not just a hash function match.
Your test does not specify
NULL or NOT NULL for the columns (a bad practice, by the way), so it appears you are using a database where NULL is the default.More information in my post Join Performance, Implicit Conversions, and Residuals and Hash Join Execution Internals by Dmitry Pilugin.
* Other qualifying types are bit, smalldatetime, smallmoney, and (var)char(n) for n = 1 and binary collation
Context
StackExchange Database Administrators Q#182917, answer score: 24
Revisions (0)
No revisions yet.