patternsqlModerate
Unique index updates and statistics row modification counters
Viewed 0 times
uniquecountersstatisticsmodificationandindexrowupdates
Problem
Given the following table, unique clustered index, and statistics:
The statistics row modification counters obviously show zero before any updates:
Incrementing each
Uses the execution plan:
It produces the following statistics modification counters:
Questions
CREATE TABLE dbo.Banana
(
pk integer NOT NULL,
c1 char(1) NOT NULL,
c2 char(1) NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX pk ON dbo.Banana (pk);
CREATE STATISTICS c1 ON dbo.Banana (c1);
CREATE STATISTICS c2 ON dbo.Banana (c2);
INSERT dbo.Banana
(pk, c1, c2)
VALUES
(1, 'A', 'W'),
(2, 'B', 'X'),
(3, 'C', 'Y'),
(4, 'D', 'Z');
-- Populate statistics
UPDATE STATISTICS dbo.Banana;The statistics row modification counters obviously show zero before any updates:
-- Show statistics modification counters
SELECT
stats_name = S.[name],
DDSP.stats_id,
DDSP.[rows],
DDSP.modification_counter
FROM sys.stats AS S
CROSS APPLY sys.dm_db_stats_properties(S.object_id, S.stats_id) AS DDSP
WHERE
S.[object_id] = OBJECT_ID(N'dbo.Banana', N'U');Incrementing each
pk column value by one for every row:-- Increment pk in every row
UPDATE dbo.Banana
SET pk += 1;Uses the execution plan:
It produces the following statistics modification counters:
Questions
- What do the Split, Sort, and Collapse operators do?
- Why do the
pkstats show 2 modifications, butc1andc2show 5?
Solution
SQL Server always uses the Split, Sort, and Collapse combination of operators when maintaining a unique index as part of an update that affects (or might affect) more than one row.
Working through the example in the question, we could write the update as a separate single-row update for each of the four rows present:
The problem is that the first statement would fail, since it changes
Split
The first step is to Split each update statement into a delete followed by an insert:
The Split operator adds an action code column to the stream (here labelled Act1007):
The action code is 1 for an update, 3 for a delete, and 4 for an insert.
Sort
The split statements above would still produce a false transient unique key violation, so the next step is to sort the statements by the keys of the unique index being updated (
Collapse
The preceding stage is enough to guarantee the avoidance of false uniqueness violations in all cases. As an optimization, Collapse combines adjacent deletes and inserts on the same key value into an update:
The delete/insert pairs for
The Collapse operator groups rows by the key columns, and updates the action code to reflect the collapse outcome:
Clustered Index Update
This operator is labelled as an Update, but it is capable of inserts, updates, and deletes. Which action is taken by the Clustered Index Update per row is determined by the value of the action code in that row. The operator has an Action property to reflect this mode of operation:
Row modification counters
Note that the three updates above do not modify the key(s) of the unique index being maintained. In effect, we have transformed updates to the key columns in the index into updates of the non-key columns (
An insert or a delete affects every single column in the row, so statistics associated with every column will have their modification counters incremented. For update(s), only statistics with any of the updated columns as the leading column have their modification counters incremented (even if the value is unchanged).
The statistics row modification counters therefore show 2 changes to
Note: Only changes applied to the base object (heap or clustered index) affect statistics row modification counters. Non-clustered indexes are secondary structures, reflecting changes already made to the base object. They do not affect statistics row modification counters at all.
If an object has multiple unique indexes, a separate Split, Sort, Collapse combination is used to organize the updates to each. SQL Server optimizes this case for nonclustered indexes by saving the result of the Split to an Eager Table Spool, then replaying that set for each uniqu
Working through the example in the question, we could write the update as a separate single-row update for each of the four rows present:
-- Per row updates
UPDATE dbo.Banana SET pk = 2 WHERE pk = 1;
UPDATE dbo.Banana SET pk = 3 WHERE pk = 2;
UPDATE dbo.Banana SET pk = 4 WHERE pk = 3;
UPDATE dbo.Banana SET pk = 5 WHERE pk = 4;The problem is that the first statement would fail, since it changes
pk from 1 to 2, and there is already a row where pk = 2. The SQL Server storage engine requires that unique indexes remain unique at every stage of processing, even within a single statement. This is the problem solved by Split, Sort, and Collapse.Split
The first step is to Split each update statement into a delete followed by an insert:
DELETE dbo.Banana WHERE pk = 1;
INSERT dbo.Banana (pk, c1, c2) VALUES (2, 'A', 'W');
DELETE dbo.Banana WHERE pk = 2;
INSERT dbo.Banana (pk, c1, c2) VALUES (3, 'B', 'X');
DELETE dbo.Banana WHERE pk = 3;
INSERT dbo.Banana (pk, c1, c2) VALUES (4, 'C', 'Y');
DELETE dbo.Banana WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');The Split operator adds an action code column to the stream (here labelled Act1007):
The action code is 1 for an update, 3 for a delete, and 4 for an insert.
Sort
The split statements above would still produce a false transient unique key violation, so the next step is to sort the statements by the keys of the unique index being updated (
pk in this case), then by the action code. For this example, this simply means that deletes (3) on the same key are ordered before inserts (4). The resulting order is:-- Sort (pk, action)
DELETE dbo.Banana WHERE pk = 1;
DELETE dbo.Banana WHERE pk = 2;
INSERT dbo.Banana (pk, c1, c2) VALUES (2, 'A', 'W');
DELETE dbo.Banana WHERE pk = 3;
INSERT dbo.Banana (pk, c1, c2) VALUES (3, 'B', 'X');
DELETE dbo.Banana WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (4, 'C', 'Y');
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');Collapse
The preceding stage is enough to guarantee the avoidance of false uniqueness violations in all cases. As an optimization, Collapse combines adjacent deletes and inserts on the same key value into an update:
-- Collapse (pk)
DELETE dbo.Banana WHERE pk = 1;
UPDATE dbo.Banana SET c1 = 'A', c2 = 'W' WHERE pk = 2;
UPDATE dbo.Banana SET c1 = 'B', c2 = 'X' WHERE pk = 3;
UPDATE dbo.Banana SET c1 = 'C', c2 = 'Y' WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');The delete/insert pairs for
pk values 2, 3, and 4 have been combined into an update, leaving a single delete on pk = 1, and an insert for pk = 5.The Collapse operator groups rows by the key columns, and updates the action code to reflect the collapse outcome:
Clustered Index Update
This operator is labelled as an Update, but it is capable of inserts, updates, and deletes. Which action is taken by the Clustered Index Update per row is determined by the value of the action code in that row. The operator has an Action property to reflect this mode of operation:
Row modification counters
Note that the three updates above do not modify the key(s) of the unique index being maintained. In effect, we have transformed updates to the key columns in the index into updates of the non-key columns (
c1 and c2), plus a delete and an insert. Neither a delete nor an insert can cause a false unique-key violation.An insert or a delete affects every single column in the row, so statistics associated with every column will have their modification counters incremented. For update(s), only statistics with any of the updated columns as the leading column have their modification counters incremented (even if the value is unchanged).
The statistics row modification counters therefore show 2 changes to
pk, and 5 for c1 and c2:-- Collapse (pk)
DELETE dbo.Banana WHERE pk = 1; -- All columns modified
UPDATE dbo.Banana SET c1 = 'A', c2 = 'W' WHERE pk = 2; -- c1 and c2 modified
UPDATE dbo.Banana SET c1 = 'B', c2 = 'X' WHERE pk = 3; -- c1 and c2 modified
UPDATE dbo.Banana SET c1 = 'C', c2 = 'Y' WHERE pk = 4; -- c1 and c2 modified
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z'); -- All columns modifiedNote: Only changes applied to the base object (heap or clustered index) affect statistics row modification counters. Non-clustered indexes are secondary structures, reflecting changes already made to the base object. They do not affect statistics row modification counters at all.
If an object has multiple unique indexes, a separate Split, Sort, Collapse combination is used to organize the updates to each. SQL Server optimizes this case for nonclustered indexes by saving the result of the Split to an Eager Table Spool, then replaying that set for each uniqu
Code Snippets
-- Per row updates
UPDATE dbo.Banana SET pk = 2 WHERE pk = 1;
UPDATE dbo.Banana SET pk = 3 WHERE pk = 2;
UPDATE dbo.Banana SET pk = 4 WHERE pk = 3;
UPDATE dbo.Banana SET pk = 5 WHERE pk = 4;DELETE dbo.Banana WHERE pk = 1;
INSERT dbo.Banana (pk, c1, c2) VALUES (2, 'A', 'W');
DELETE dbo.Banana WHERE pk = 2;
INSERT dbo.Banana (pk, c1, c2) VALUES (3, 'B', 'X');
DELETE dbo.Banana WHERE pk = 3;
INSERT dbo.Banana (pk, c1, c2) VALUES (4, 'C', 'Y');
DELETE dbo.Banana WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');-- Sort (pk, action)
DELETE dbo.Banana WHERE pk = 1;
DELETE dbo.Banana WHERE pk = 2;
INSERT dbo.Banana (pk, c1, c2) VALUES (2, 'A', 'W');
DELETE dbo.Banana WHERE pk = 3;
INSERT dbo.Banana (pk, c1, c2) VALUES (3, 'B', 'X');
DELETE dbo.Banana WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (4, 'C', 'Y');
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');-- Collapse (pk)
DELETE dbo.Banana WHERE pk = 1;
UPDATE dbo.Banana SET c1 = 'A', c2 = 'W' WHERE pk = 2;
UPDATE dbo.Banana SET c1 = 'B', c2 = 'X' WHERE pk = 3;
UPDATE dbo.Banana SET c1 = 'C', c2 = 'Y' WHERE pk = 4;
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z');-- Collapse (pk)
DELETE dbo.Banana WHERE pk = 1; -- All columns modified
UPDATE dbo.Banana SET c1 = 'A', c2 = 'W' WHERE pk = 2; -- c1 and c2 modified
UPDATE dbo.Banana SET c1 = 'B', c2 = 'X' WHERE pk = 3; -- c1 and c2 modified
UPDATE dbo.Banana SET c1 = 'C', c2 = 'Y' WHERE pk = 4; -- c1 and c2 modified
INSERT dbo.Banana (pk, c1, c2) VALUES (5, 'D', 'Z'); -- All columns modifiedContext
StackExchange Database Administrators Q#185551, answer score: 16
Revisions (0)
No revisions yet.