patternsqlMinor
STATISTICS IO difference for heap and clustered index
Viewed 0 times
clusteredheapdifferencestatisticsforandindex
Problem
Assume we have two tables with data:
By inspecting their storage statistics
one can see that both occupy the same number of pages:
In this case why
show that scanning the clustered index compared to scanning the heap takes one extra logical read?
What this extra logical read is?
P.S.
My question is not about "clustered index vs heap performance" or query tuning. I'm trying to better understand things involved into
The example is for SqlServer 2014
Microsoft SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64)
I also tried it on SqlServer 200
create table heap (value int);
create table clust (value int primary key);
insert into heap values (1);
insert into clust values (1);By inspecting their storage statistics
select obj.name, st.alloc_unit_type_desc, st.index_level, st.page_count
from (values ('heap'), ('clust')) obj(name)
join sys.indexes ix on ix.object_id = object_id(obj.name)
cross apply sys.dm_db_index_physical_stats(
db_id(), ix.object_id, ix.index_id, NULL, 'DETAILED') st;
select obj.name, au.total_pages, au.used_pages, au.data_pages
from (values ('heap'), ('clust')) obj(name)
join sys.indexes ix on ix.object_id = object_id(obj.name)
join sys.partitions p on p.object_id = ix.object_id and p.index_id = ix.index_id
join sys.allocation_units au on au.container_id = p.partition_id;one can see that both occupy the same number of pages:
name alloc_unit_type_desc index_level page_count
----- -------------------- ----------- ----------
heap IN_ROW_DATA 0 1
clust IN_ROW_DATA 0 1
name total_pages used_pages data_pages
----- ------------ ----------- -----------
heap 2 2 1
clust 2 2 1In this case why
statistics ioset nocount on;
set statistics io on;
declare @cnt int;
select @cnt = count(1) from heap;
select @cnt = count(1) from clust;
set statistics io off;show that scanning the clustered index compared to scanning the heap takes one extra logical read?
Table 'heap'. Scan count 1, logical reads 1...
Table 'clust'. Scan count 1, logical reads 2...What this extra logical read is?
P.S.
My question is not about "clustered index vs heap performance" or query tuning. I'm trying to better understand things involved into
STATISTICS IO reported for clustered index scan.The example is for SqlServer 2014
Microsoft SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64)
I also tried it on SqlServer 200
Solution
Accessing either a Clustered Index or Non-Clustered Index requires traversing that b-tree structure. For some reason, on scan operations there seems to be one extra logical read. In your case you have a single page, which is the 2 logical reads. If you had enough rows to fill up enough data pages that would in turn require another level within the b-tree index structure, then you would see an additional logical read.
Heaps, by definition, have no index (b-tree) structure. Since you have one data page, the operations only need that one logical read. In such a simplistic example it would appear to be less work than the Clustered Index approach, but as soon as you get a few more data pages then you will start to see a difference since the b-tree structure will allow for going directly to appropriate data pages while the Heap still has to check all of the pages.
For example, I have a test table with a structure of:
It has 767,968 rows in it via:
I copied it to a new table that is the same structure and data, but missing the two constraints (i.e. no Clustered Index) using the following:
The following queries:
show that
The following queries:
shows that
I then rebuilt the tables via:
Running the
Running the
Now, let's find one specific row:
The Heap still takes 3135 logical reads. But the Clustered Index takes a mere 3 logical reads: 1 for the root index page, 1 for the next level index page, and 1 for the leaf level / data page.
Now let's force a scan as we look for a single row:
Here we get the same logical reads that the
Heaps, by definition, have no index (b-tree) structure. Since you have one data page, the operations only need that one logical read. In such a simplistic example it would appear to be less work than the Clustered Index approach, but as soon as you get a few more data pages then you will start to see a difference since the b-tree structure will allow for going directly to appropriate data pages while the Heap still has to check all of the pages.
For example, I have a test table with a structure of:
CREATE TABLE [dbo].[GuidPkAsUI](
[ID] [uniqueidentifier] NOT NULL CONSTRAINT [PK_GuidPkAsUI] PRIMARY KEY CLUSTERED,
[InsertTime] [datetime] NOT NULL
CONSTRAINT [DF_GuidPkAsUI_InsertTime] DEFAULT (getdate()),
);It has 767,968 rows in it via:
INSERT INTO [dbo].[GuidPkAsUI] ([ID])
SELECT TOP (767968) NEWID()
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.objects so1;I copied it to a new table that is the same structure and data, but missing the two constraints (i.e. no Clustered Index) using the following:
SELECT *
INTO dbo.GuidPkAsUIheap
FROM dbo.GuidPkAsUI;The following queries:
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.GuidPkAsUIheap'),
0, NULL, 'DETAILED');
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.GuidPkAsUI'),
1, NULL, 'DETAILED');show that
GuidPkAsUIheap has 1 level (with 3135 data pages) and GuidPkAsUI has 3 levels (with 3135 data pages, 20 index pages on one level -- intermediate, and 1 index page on another level -- root, totaling 3156 pages).The following queries:
SET STATISTICS IO ON;
SELECT COUNT(1) FROM dbo.GuidPkAsUIheap;
SET STATISTICS IO OFF;
-- 3135
SET STATISTICS IO ON;
SELECT COUNT(1) FROM dbo.GuidPkAsUI;
SET STATISTICS IO OFF;
-- 3157shows that
GuidPkAsUIheap requires 3135 logical reads (the number of data pages) while GuidPkAsUI requires 3157 logical reads (the number of data and index pages plus one). So here the logical reads for the Clustered Index are still higher than for the Heap.I then rebuilt the tables via:
ALTER TABLE [dbo].[GuidPkAsUIheap] REBUILD;
ALTER TABLE [dbo].[GuidPkAsUI] REBUILD WITH (FILLFACTOR = 100);Running the
SELECT * FROM sys.dm_db_index_physical_stats queries above again shows the Heap to be the same but the Clustered Index now has only 10 intermediate index pages instead of 20.Running the
SELECT COUNT(1) queries above again shows the same 3135 logical reads for the Heap and 3147 logical reads for the Clustered Index (all data and index pages plus one).Now, let's find one specific row:
SET STATISTICS IO ON;
SELECT * FROM dbo.GuidPkAsUIheap WHERE [ID] = '93359759-193F-4CBF-B9F6-738475F8488E';
SET STATISTICS IO OFF;
-- 3135
SET STATISTICS IO ON;
SELECT * FROM dbo.GuidPkAsUI WHERE [ID] = '93359759-193F-4CBF-B9F6-738475F8488E';
SET STATISTICS IO OFF;
-- 3The Heap still takes 3135 logical reads. But the Clustered Index takes a mere 3 logical reads: 1 for the root index page, 1 for the next level index page, and 1 for the leaf level / data page.
Now let's force a scan as we look for a single row:
SET STATISTICS IO ON;
SELECT * FROM dbo.GuidPkAsUIheap WHERE CONVERT(CHAR(36), [ID]) = '98331062-8BF3-4FAE-98B4-204D0DE06FE1';
SET STATISTICS IO OFF;
-- 3135
SET STATISTICS IO ON;
SELECT * FROM dbo.GuidPkAsUI WHERE CONVERT(CHAR(36), [ID]) = '98331062-8BF3-4FAE-98B4-204D0DE06FE1';
SET STATISTICS IO OFF;
-- 3147Here we get the same logical reads that the
COUNT(1) queries get.Code Snippets
CREATE TABLE [dbo].[GuidPkAsUI](
[ID] [uniqueidentifier] NOT NULL CONSTRAINT [PK_GuidPkAsUI] PRIMARY KEY CLUSTERED,
[InsertTime] [datetime] NOT NULL
CONSTRAINT [DF_GuidPkAsUI_InsertTime] DEFAULT (getdate()),
);INSERT INTO [dbo].[GuidPkAsUI] ([ID])
SELECT TOP (767968) NEWID()
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.objects so1;SELECT *
INTO dbo.GuidPkAsUIheap
FROM dbo.GuidPkAsUI;SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.GuidPkAsUIheap'),
0, NULL, 'DETAILED');
SELECT * FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.GuidPkAsUI'),
1, NULL, 'DETAILED');SET STATISTICS IO ON;
SELECT COUNT(1) FROM dbo.GuidPkAsUIheap;
SET STATISTICS IO OFF;
-- 3135
SET STATISTICS IO ON;
SELECT COUNT(1) FROM dbo.GuidPkAsUI;
SET STATISTICS IO OFF;
-- 3157Context
StackExchange Database Administrators Q#152844, answer score: 5
Revisions (0)
No revisions yet.