HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

DISTINCT not reducing two equal values to one

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
distinctequalreducingtwoonevaluesnot

Problem

Can anyone explain the below situation, where two seemingly equal values are not reduced by DISTINCT?

The query above is SELECT DISTINCT name FROM master.sys.dm_os_spinlock_stats where name = 'SBS_UCS_DISPATCH';

The equivalent SELECT name FROM master.sys.dm_os_spinlock_stats where name = 'SBS_UCS_DISPATCH' GROUP BY name; also does the same, and adding HAVING COUNT(1) > 1 does not yield the rows.

@@VERSION is Microsoft SQL Server 2019 (RTM-CU13) (KB5005679) - 15.0.4178.1 (X64) Sep 23 2021 16:47:49 Copyright (C) 2019 Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393: )

Solution

It looks like the DISTINCT gets optimised out during simplification.

select distinct name 
FROM master.sys.dm_os_spinlock_stats 
OPTION (QUERYTRACEON 8606,QUERYTRACEON 3604)


Gives

*** Input Tree: ***
        LogOp_GbAgg OUT(QCOL: DM_OS_SPINLOCKSTATS.name,) BY(QCOL: DM_OS_SPINLOCKSTATS.name,)

            LogOp_Project

                LogOp_Project QCOL: DM_OS_SPINLOCKSTATS.name

                    LogOp_Project

                        LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs)

                        AncOp_PrjList 

                    AncOp_PrjList 

                AncOp_PrjList 

            AncOp_PrjList 

*******************


Then the GbAgg disappears in the Simplified Tree

*** Simplified Tree: ***
        LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs)


I assume that there is some metadata indicating name is unique despite the query results to the contrary but I can't see how to prove that.

One way of working around this is the below.

SELECT DISTINCT name COLLATE Latin1_General_100_CI_AS
FROM master.sys.dm_os_spinlock_stats 
WHERE name COLLATE Latin1_General_100_CI_AS = 'SBS_UCS_DISPATCH'


In general even if name was unique under one collation then it could still have duplicates with different collate semantics. The original column collation for me is Latin1_General_CI_AS so this doesn't change the case insensitive accent sensitive nature but it is enough to preserve the GbAgg in the plan.

SELECT ca.name, 
       COUNT(*) AS Count,  
       SERVERPROPERTY('ProductVersion') AS [version]
FROM master.sys.dm_os_spinlock_stats 
CROSS APPLY (VALUES(name COLLATE Latin1_General_100_BIN)) ca(name)
GROUP BY ca.name
HAVING COUNT(*) > 1


Returns the following on my dev instance

name
Count
version

LOGPOOL_FREEBUFMGR
2
16.0.1050.5

SBS_UCS_DISPATCH
2
16.0.1050.5

So SBS_UCS_DISPATCH is not alone in being duplicated.

For me the values in the other columns are all 0 for these. So I'm unclear if the entire rows will be duplicated or not in the event that these spin locks are encountered - or whether they would need to be aggregated.

The affected types aren't ones that are documented other than "for internal use" anyway though.

Code Snippets

select distinct name 
FROM master.sys.dm_os_spinlock_stats 
OPTION (QUERYTRACEON 8606,QUERYTRACEON 3604)
*** Input Tree: ***
        LogOp_GbAgg OUT(QCOL: DM_OS_SPINLOCKSTATS.name,) BY(QCOL: DM_OS_SPINLOCKSTATS.name,)

            LogOp_Project

                LogOp_Project QCOL: DM_OS_SPINLOCKSTATS.name

                    LogOp_Project

                        LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs)


                        AncOp_PrjList 

                    AncOp_PrjList 

                AncOp_PrjList 

            AncOp_PrjList 

*******************
*** Simplified Tree: ***
        LogOp_StreamingTabUdfRESULT(QCOL: DM_OS_SPINLOCKSTATS.name QCOL: DM_OS_SPINLOCKSTATS.collisions QCOL: DM_OS_SPINLOCKSTATS.spins QCOL: DM_OS_SPINLOCKSTATS.spins_per_collision QCOL: DM_OS_SPINLOCKSTATS.sleep_time QCOL: DM_OS_SPINLOCKSTATS.backoffs)
SELECT DISTINCT name COLLATE Latin1_General_100_CI_AS
FROM master.sys.dm_os_spinlock_stats 
WHERE name COLLATE Latin1_General_100_CI_AS = 'SBS_UCS_DISPATCH'
SELECT ca.name, 
       COUNT(*) AS Count,  
       SERVERPROPERTY('ProductVersion') AS [version]
FROM master.sys.dm_os_spinlock_stats 
CROSS APPLY (VALUES(name COLLATE Latin1_General_100_BIN)) ca(name)
GROUP BY ca.name
HAVING COUNT(*) > 1

Context

StackExchange Database Administrators Q#329943, answer score: 16

Revisions (0)

No revisions yet.