HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Understanding SQL Server statistics better

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
understandingsqlbetterstatisticsserver

Problem

I have a table with 5,000,000 records with the DateOfBirth column spread between 1950 and 2007. The statistics histogram for the index only has two RANGE_HI_KEYs. I feel like the histogram should have more buckets given the quantity of records and selectiveness of the field.

Can anyone explain to me why SQL Server is only using two RANGE_HI_KEYs?

NOTE: I get the same stats breakdown on 2014 and 2016

Table

CREATE TABLE [dbo].[Person](
    [BusinessEntityID] [INT] NOT NULL,
    [PersonType] [NCHAR](2) NOT NULL,
    [NameStyle] [dbo].[NameStyle] NOT NULL,
    [Title] [NVARCHAR](8) NULL,
    [FirstName] [dbo].[Name] NOT NULL,
    [MiddleName] [dbo].[Name] NULL,
    [LastName] [dbo].[Name] NOT NULL,
    [Suffix] [NVARCHAR](10) NULL,
    [EmailPromotion] [INT] NOT NULL,
    [rowguid] [UNIQUEIDENTIFIER] NOT NULL,
    [ModifiedDate] [DATETIME] NOT NULL,
    [DateOfBirth] [DATE] NOT NULL
)


Index

CREATE NONCLUSTERED INDEX [IX_dbo_Person_DateOfBirth] 
ON [dbo].[Person] ( [DateOfBirth] ASC )


Histogram

```
Statistics for INDEX 'IX_dbo_Person_DateOfBirth'.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Name Updated Rows Rows Sampled Steps Density Average Key Length String Index
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Solution

Very easy. Look for the result of:

select datediff(day,'1/1/1950','12/31/2007');


It will give you your DISTINCT_RANGE_ROWS + 1
That means that SQL Server thinks that all your DOBs are approximately equally distributed across the range with average 236 items per day.
Query your data and see if you have any outlines like: missing day ranges or >1000 DOBs per day.
If you do, than that would be a question to the engine.

Code Snippets

select datediff(day,'1/1/1950','12/31/2007');

Context

StackExchange Database Administrators Q#142208, answer score: 3

Revisions (0)

No revisions yet.