patternsqlMinor
SQL Server 2019 column store indexes - maintenance
Viewed 0 times
columnsql2019storeindexesmaintenanceserver
Problem
I have a clustered columned stored index on table used for logging - only inserts (but not bulk inserts).
The current table stats are:
I see this morning the following operation via
I use the following query to check how many rows per
and we
The issue is that we are using standard edition (on premise) and the rebuild operation is not performed online and cause a lot of blocking.
I have never seen such issue before. A few weeks ago we have upgraded from
My questions are:
The current table stats are:
- 3541 millions rows
- 6.6 GB reserved space
I see this morning the following operation via
sp_whoisactive:ALTER INDEX [...] ON [...].[...]
REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);I use the following query to check how many rows per
row_group_id we have:SELECT
tables.name AS table_name,
indexes.name AS index_name,
partitions.partition_number,
dm_db_column_store_row_group_physical_stats.row_group_id,
dm_db_column_store_row_group_physical_stats.total_rows,
dm_db_column_store_row_group_physical_stats.deleted_rows,
dm_db_column_store_row_group_physical_stats.state_desc,
dm_db_column_store_row_group_physical_stats.trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats
INNER JOIN sys.indexes
ON indexes.index_id =
dm_db_column_store_row_group_physical_stats.index_id
AND indexes.object_id =
dm_db_column_store_row_group_physical_stats.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.partitions
ON partitions.partition_number =
dm_db_column_store_row_group_physical_stats.partition_number
AND partitions.index_id = indexes.index_id
AND partitions.object_id = tables.object_idand we
3383 row groups with 1048576 rows and few at end like this:The issue is that we are using standard edition (on premise) and the rebuild operation is not performed online and cause a lot of blocking.
I have never seen such issue before. A few weeks ago we have upgraded from
SQL Server 2016 SP1 to SQL Server 2019.My questions are:
- if only inserts are applied, should be the operation
reorganizeand be faster
- if not, if we apply partitioning, for example on year basis as the table is used for logging, will the automation process only rebuild the data for the last partition
Solution
I have a clustered columned stored index on table used for logging - only inserts
if only inserts are applied, should be the operation reorganize and be faster
You should not even bother. Reorganize for a columnstore does:
Physically removes rows from a rowgroup when 10% or more of the rows
have been logically deleted. The deleted bytes are reclaimed on the
physical media. For example, if a compressed row group of 1 million
rows has 100K rows deleted, SQL Server will remove the deleted rows
and recompress the rowgroup with 900k rows. It saves on the storage by
removing deleted rows.
Combines one or more compressed rowgroups to increase rows per
rowgroup up to the maximum of 1,048,576 rows. For example, if you bulk
import 5 batches of 102,400 rows you will get 5 compressed rowgroups.
If you run REORGANIZE, these rowgroups will get merged into 1
compressed rowgroup of size 512,000 rows. This assumes there were no
dictionary size or memory limitations.
For rowgroups in which 10% or more of the rows have been logically
deleted, the Database Engine tries to combine this rowgroup with one
or more rowgroups. For example, rowgroup 1 is compressed with 500,000
rows and rowgroup 21 is compressed with the maximum of 1,048,576 rows.
Rowgroup 21 has 60% of the rows deleted which leaves 409,830 rows. The
Database Engine favors combining these two rowgroups to compress a new
rowgroup that has 909,830 rows.
Considerations specific to reorganizing a columnstore index
So all it will do is combine the open row groups into a new compressed rowgroup. Then the next time you insert anything you'll get new open rowgroups. So there's no real benefit to REORGANIZE in your scenario.
As J.D. advises, you might partition this table so if you want to apply archival compression only to older partitions. But your compression is already quite good.
if only inserts are applied, should be the operation reorganize and be faster
You should not even bother. Reorganize for a columnstore does:
Physically removes rows from a rowgroup when 10% or more of the rows
have been logically deleted. The deleted bytes are reclaimed on the
physical media. For example, if a compressed row group of 1 million
rows has 100K rows deleted, SQL Server will remove the deleted rows
and recompress the rowgroup with 900k rows. It saves on the storage by
removing deleted rows.
Combines one or more compressed rowgroups to increase rows per
rowgroup up to the maximum of 1,048,576 rows. For example, if you bulk
import 5 batches of 102,400 rows you will get 5 compressed rowgroups.
If you run REORGANIZE, these rowgroups will get merged into 1
compressed rowgroup of size 512,000 rows. This assumes there were no
dictionary size or memory limitations.
For rowgroups in which 10% or more of the rows have been logically
deleted, the Database Engine tries to combine this rowgroup with one
or more rowgroups. For example, rowgroup 1 is compressed with 500,000
rows and rowgroup 21 is compressed with the maximum of 1,048,576 rows.
Rowgroup 21 has 60% of the rows deleted which leaves 409,830 rows. The
Database Engine favors combining these two rowgroups to compress a new
rowgroup that has 909,830 rows.
Considerations specific to reorganizing a columnstore index
So all it will do is combine the open row groups into a new compressed rowgroup. Then the next time you insert anything you'll get new open rowgroups. So there's no real benefit to REORGANIZE in your scenario.
As J.D. advises, you might partition this table so if you want to apply archival compression only to older partitions. But your compression is already quite good.
Context
StackExchange Database Administrators Q#282893, answer score: 5
Revisions (0)
No revisions yet.