patternsqlMajor
Efficient INSERT INTO a Table With Clustered Index
Viewed 0 times
clusteredinsertwithintoefficientindextable
Problem
I have a SQL statement that inserts rows into a table with a clustered index on the column TRACKING_NUMBER.
E.G.:
My question is - does it help to use an ORDER BY clause in the SELECT statement for the clustered index column, or would any gain acheived be negated by the extra sort required for the ORDER BY clause?
E.G.:
INSERT INTO TABL_NAME (TRACKING_NUMBER, COLB, COLC)
SELECT TRACKING_NUMBER, COL_B, COL_C
FROM STAGING_TABLEMy question is - does it help to use an ORDER BY clause in the SELECT statement for the clustered index column, or would any gain acheived be negated by the extra sort required for the ORDER BY clause?
Solution
As the other answers already indicate SQL Server may or may not explicitly ensure that the rows are sorted in clustered index order prior to the
This is dependant upon whether or not the clustered index operator in the plan has the
If you find that SQL Server is underestimating this for whatever reason you might benefit from adding an explicit
Example:
Shows that
But for
Conversely sometimes you might want to force SQL Server to underestimate the row count when you know the data is already pre-sorted and wish to avoid an unnecessary sort. One notable example is when inserting a large number of rows into a table with a
SQL Server will then estimate that 100 rows will be inserted irrespective of the size of
insert.This is dependant upon whether or not the clustered index operator in the plan has the
DMLRequestSort property set (which in turn depends upon the estimated number of rows that are inserted).If you find that SQL Server is underestimating this for whatever reason you might benefit from adding an explicit
ORDER BY to the SELECT query to minimize page splits and ensuing fragmentation from the INSERT operation Example:
use tempdb;
GO
CREATE TABLE T(N INT PRIMARY KEY,Filler char(2000))
CREATE TABLE T2(N INT PRIMARY KEY,Filler char(2000))
GO
DECLARE @T TABLE (U UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),N int)
INSERT INTO @T(N)
SELECT number
FROM master..spt_values
WHERE type = 'P' AND number BETWEEN 0 AND 499
/*Estimated row count wrong as inserting from table variable*/
INSERT INTO T(N)
SELECT T1.N*1000 + T2.N
FROM @T T1, @T T2
/*Same operation using explicit sort*/
INSERT INTO T2(N)
SELECT T1.N*1000 + T2.N
FROM @T T1, @T T2
ORDER BY T1.N*1000 + T2.N
SELECT avg_fragmentation_in_percent,
fragment_count,
page_count,
avg_page_space_used_in_percent,
record_count
FROM sys.dm_db_index_physical_stats(2, OBJECT_ID('T'), NULL, NULL, 'DETAILED')
;
SELECT avg_fragmentation_in_percent,
fragment_count,
page_count,
avg_page_space_used_in_percent,
record_count
FROM sys.dm_db_index_physical_stats(2, OBJECT_ID('T2'), NULL, NULL, 'DETAILED')
;Shows that
T is massively fragmentedavg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count
---------------------------- -------------------- -------------------- ------------------------------ --------------------
99.3116118225536 92535 92535 67.1668272794663 250000
99.5 200 200 74.2868173956017 92535
0 1 1 32.0978502594514 200But for
T2 fragmentation is minimalavg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count
---------------------------- -------------------- -------------------- ------------------------------ --------------------
0.376 262 62500 99.456387447492 250000
2.1551724137931 232 232 43.2438349394613 62500
0 1 1 37.2374598468001 232Conversely sometimes you might want to force SQL Server to underestimate the row count when you know the data is already pre-sorted and wish to avoid an unnecessary sort. One notable example is when inserting a large number of rows into a table with a
newsequentialid clustered index key. In versions of SQL Server prior to Denali SQL Server adds an unnecessary and potentially expensive sort operation. This can be avoided by DECLARE @var INT =2147483647
INSERT INTO Foo
SELECT TOP (@var) *
FROM BarSQL Server will then estimate that 100 rows will be inserted irrespective of the size of
Bar which is below the threshold at which a sort is added to the plan. However as pointed out in the comments below this does mean that the insert will unfortunately not be able to take advantage of minimal logging.Code Snippets
use tempdb;
GO
CREATE TABLE T(N INT PRIMARY KEY,Filler char(2000))
CREATE TABLE T2(N INT PRIMARY KEY,Filler char(2000))
GO
DECLARE @T TABLE (U UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID(),N int)
INSERT INTO @T(N)
SELECT number
FROM master..spt_values
WHERE type = 'P' AND number BETWEEN 0 AND 499
/*Estimated row count wrong as inserting from table variable*/
INSERT INTO T(N)
SELECT T1.N*1000 + T2.N
FROM @T T1, @T T2
/*Same operation using explicit sort*/
INSERT INTO T2(N)
SELECT T1.N*1000 + T2.N
FROM @T T1, @T T2
ORDER BY T1.N*1000 + T2.N
SELECT avg_fragmentation_in_percent,
fragment_count,
page_count,
avg_page_space_used_in_percent,
record_count
FROM sys.dm_db_index_physical_stats(2, OBJECT_ID('T'), NULL, NULL, 'DETAILED')
;
SELECT avg_fragmentation_in_percent,
fragment_count,
page_count,
avg_page_space_used_in_percent,
record_count
FROM sys.dm_db_index_physical_stats(2, OBJECT_ID('T2'), NULL, NULL, 'DETAILED')
;avg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count
---------------------------- -------------------- -------------------- ------------------------------ --------------------
99.3116118225536 92535 92535 67.1668272794663 250000
99.5 200 200 74.2868173956017 92535
0 1 1 32.0978502594514 200avg_fragmentation_in_percent fragment_count page_count avg_page_space_used_in_percent record_count
---------------------------- -------------------- -------------------- ------------------------------ --------------------
0.376 262 62500 99.456387447492 250000
2.1551724137931 232 232 43.2438349394613 62500
0 1 1 37.2374598468001 232DECLARE @var INT =2147483647
INSERT INTO Foo
SELECT TOP (@var) *
FROM BarContext
StackExchange Database Administrators Q#7350, answer score: 21
Revisions (0)
No revisions yet.