HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Unique filtered index and surprising estimation

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
uniqueestimationandindexsurprisingfiltered

Problem

When I view the estimated execution plan for this query...

SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod


... the Estimated Number of Rows property for [table1] shows 10. Why is this? In the schema (copied below) I have a unique index AKF_tabela1_Kod. Shouldn't this unique index force the row estimation to be 1?

Estimated Execution Plan

Create Database, Tables and Index

-- create database 
USE [master]
GO
DROP DATABASE IF EXISTS my_test;
CREATE DATABASE my_test;
GO
ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;
ALTER AUTHORIZATION ON DATABASE ::my_test TO sa;
USE my_test
GO

-- create dbo.tabela1

CREATE TABLE dbo.[tabela1] 
    (
    [Id] int NOT NULL identity(1,1),
    [kol1] varchar(20) NOT NULL,
    [Kod] int NULL,  
    CONSTRAINT [PK_tabela1] PRIMARY KEY NONCLUSTERED([id] ASC) WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON),
    CONSTRAINT [BK_tabela1] UNIQUE CLUSTERED([Kol1] ASC) WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON),
    );
GO

-- create Index 

CREATE UNIQUE NONCLUSTERED INDEX [AKF_tabela1_Kod] 
    ON dbo.[tabela1]([Kod] ASC) 
    INCLUDE ([id],[kol1]) 
    WHERE ([Kod] IS NOT NULL) 
    WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, MAXDOP = 0);
GO

-- create dbo.tabela2

IF  OBJECT_ID('dbo.tabela2') IS NULL
  CREATE TABLE tabela2 (Kod int NOT NULL)
GO


Insert Data into Table

-- create data in tabela1
declare @i int=1
    while @i<=100
    begin
      INSERT tabela1 (kol1, kod)
      values (@i, cast(@i+100 as varchar))
      set @i += 1;
    end;
GO

UPDATE STATISTICS tabela1 WITH FULLSCAN;


Here the Execution plan

Solution

the Estimated Number of Rows property for [table1] shows 10. Why is
this? In the schema (copied below) I have a unique index
AKF_tabela1_Kod. Shouldn't this unique index force the row estimation
to be 1?

The statistics are created when creating the unique index, but as a result of the unique index being filtered, the statistics are also filtered.
(where kod is not null).

As a result, the query as it is cannot use these statistics.

Why?

The Index can be used, but the statistics are not a match for the query.

Even if you do a select on kod:

select kod from dbo.tabela1 t 
     where kod is not null  option(recompile)


The statistics are still no match:

If you do a select of the id however:

select id from dbo.tabela1 t 
 where kod is not null  option(recompile)


There is a match:

This seems to be buggy with auto create stats:

ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)


This does not create stats for some reason, but is still able to do the correct estimation.

No new stats:

Re running this:

ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)


Gives the same warning:

This seems to be an issue with auto create statistics being off, even though no stats are created.

I am not sure, but an estimate of 10 might be a 'default' value to use, like when using a table variable.

Workarounds:

1

If you create the next index, the result will be 1 estimated rows

CREATE UNIQUE NONCLUSTERED INDEX [AKF_tabela1_Kod2] 
    ON dbo.[tabela1]([Kod] ASC) 
    INCLUDE ([id]) 
    WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, MAXDOP = 0);
GO


2

Another route you could take is creating your own statistics:

create statistics ST_Tablea1_Kod on dbo.tabela1(kod)


Retry the query (with a new execution plan):

SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on  t.Kod = x.Kod option(recompile)


3

Or ofcourse, enable auto create statistics and rerun the query:

ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;


Update:

The below queries give a stat missing on the first execution, but not on the other two. A statistic was created by auto create stats.

SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)
      ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;
         SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)
           ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

             SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)


Dropping the stats and retrying with a select:

drop statistics  dbo.tabela1.[_WA_Sys_00000003_0EA330E9]

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)


Does not generate stats, and gives the same missing index hints on the first and third execution:

I thought that this maybe was because of the optimization level being trivial. But this was not the case.

StatementOptmLevel="TRIVIAL"


Non trivial optimization level:

select kod from dbo.tabela1 t 
 where kod is not null 
 group by kod  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null     group by kod   option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null    group by kod   option(recompile)


Result:

Query #2 with automatic stat creation:

(I added 100 rows for a total of 200)

Query #3 , after putting automatic stat creation off again:

(180 estimated rows)

As a conclusion, this should have something to do with automatic stat creation being disabled.
My guess is on some sort of temporary stats or the way the plan is created.
I checked if i could find any 'temp'stats that are created at runtime,
but i did not find anything.
(select * from sys.stats where is_temporary = 1)

Code Snippets

select kod from dbo.tabela1 t 
     where kod is not null  option(recompile)
select id from dbo.tabela1 t 
 where kod is not null  option(recompile)
ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)
ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)
CREATE UNIQUE NONCLUSTERED INDEX [AKF_tabela1_Kod2] 
    ON dbo.[tabela1]([Kod] ASC) 
    INCLUDE ([id]) 
    WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, MAXDOP = 0);
GO

Context

StackExchange Database Administrators Q#224640, answer score: 4

Revisions (0)

No revisions yet.