snippetMinor
Distinct Sort Operator with no Order, Distinct, Grouping
Viewed 0 times
distinctorderoperatorgroupingwithsort
Problem
I have two queries that are mostly the same bar a where expression. They are UNION ALL'd together for a complete result set. There is no grouping, nor ordering, nor distinct in the queries. However, directly prior to the final concatenation/select operator and after any joins sit two distinct sorts. I'm a bit confused by these distinct sorts given the aforementioned info.
Can anyone provide insight into why this operator is in this plan?
FWIW, I have rewritten this query, just trying to fill in knowledge gaps/correct misunderstandings.
Thanks
https://www.brentozar.com/pastetheplan/?id=Sy2RJrd7h
Can anyone provide insight into why this operator is in this plan?
FWIW, I have rewritten this query, just trying to fill in knowledge gaps/correct misunderstandings.
Thanks
https://www.brentozar.com/pastetheplan/?id=Sy2RJrd7h
Solution
The query is written as two similar
The execution plan for that branch is more along the following lines (*).
Instead of driving the query by doing the join on
The join cannot just simply be changed to an inner join and left like that because duplicate rows for a StagingKey in
So these are removed by the
The same applies to the other branch of the
(*) - The actual duplicate removal by the
UNION ALL branches - with one of them being as follows.SELECT EventKey,
cast(ms.AdmitDateTime AS DATE) AS StagingAdmitDate,
EventAdmitDate,
ms.DischargeDateTime
FROM #Events AS e
JOIN CM.Staging AS ms
ON ms.UniqueCaseIdentifier = e.UniqueCaseIdentifier
AND ms.HospitalID = e.HospitalID
WHERE EXISTS(SELECT 1
FROM #UpdatedStagingRecords AS i
WHERE ms.StagingKey = i.StagingKey)
AND e.DischargeDateTime IS NULLThe execution plan for that branch is more along the following lines (*).
SELECT DISTINCT EventKey,
cast(ms.AdmitDateTime AS DATE) AS StagingAdmitDate,
EventAdmitDate,
ms.DischargeDateTime
FROM #UpdatedStagingRecords AS i
INNER JOIN CM.Staging AS ms
ON ms.StagingKey = i.StagingKey
INNER JOIN #Events e
ON ms.UniqueCaseIdentifier = e.UniqueCaseIdentifier
AND ms.HospitalID = e.HospitalID
AND e.DischargeDateTime IS NULLInstead of driving the query by doing the join on
#Events (741 rows) and CM.Staging (120907 rows) and then doing a semi join to see if there was a match in #UpdatedStagingRecords it drives the query from #UpdatedStagingRecords itself (2 rows) and does two index seeks in CM.Staging.The join cannot just simply be changed to an inner join and left like that because duplicate rows for a StagingKey in
#UpdatedStagingRecords could lead to duplicate rows in the result that would not be there with the original exists syntax.So these are removed by the
DISTINCT-ification step.The same applies to the other branch of the
UNION too but it is so similar I haven't addressed that specifically.(*) - The actual duplicate removal by the
DISTINCT is not something that can be easily expressed by representing with SQL syntax. It is not actually doing a DISTINCT based on the projected column values. It is doing it based on Bmknnnn column which maps to the physical row identifier from the #Events table.Code Snippets
SELECT EventKey,
cast(ms.AdmitDateTime AS DATE) AS StagingAdmitDate,
EventAdmitDate,
ms.DischargeDateTime
FROM #Events AS e
JOIN CM.Staging AS ms
ON ms.UniqueCaseIdentifier = e.UniqueCaseIdentifier
AND ms.HospitalID = e.HospitalID
WHERE EXISTS(SELECT 1
FROM #UpdatedStagingRecords AS i
WHERE ms.StagingKey = i.StagingKey)
AND e.DischargeDateTime IS NULLSELECT DISTINCT EventKey,
cast(ms.AdmitDateTime AS DATE) AS StagingAdmitDate,
EventAdmitDate,
ms.DischargeDateTime
FROM #UpdatedStagingRecords AS i
INNER JOIN CM.Staging AS ms
ON ms.StagingKey = i.StagingKey
INNER JOIN #Events e
ON ms.UniqueCaseIdentifier = e.UniqueCaseIdentifier
AND ms.HospitalID = e.HospitalID
AND e.DischargeDateTime IS NULLContext
StackExchange Database Administrators Q#326527, answer score: 4
Revisions (0)
No revisions yet.