HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Rebuilding large columnstore indexed tables - am I doing it right?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablescolumnstorerightdoingindexedlargerebuilding

Problem

We are planning to move the database from SQL 2014 ENT into 2016 ENT.
We realized the data currently in columnstore table are not aligned in time when they are loaded, so we need to re-align the data.

Stats:

  • 3 big tables (columnstore indexed)



  • the big tables has 60+ billion rows



  • 4 data files (round robin)



  • partitioned by monthly



  • SQL Server 2014 ENT, 128 GB RAM



  • High Performance VM platform, 32 CPUs



File groups/datafiles:

  • PRIMARY



  • DATA (4 data files, each data file on 2TB disk)



  • LOG



Parition details:

  • Monthly partition function



  • All paritiona scheme on DATA FG



Table sizes (#rows):

  • T1 34,807,580,311



  • T2 16,458,306,369



  • T3 10,170,792,290



What I plan to do:

  • drop the columnstore index, create row-store index (this is to align the data in logtime)



  • then drop the row store index, convert the table into column store index (better compression and queries later)



I am concern this might took extremely long time to rebuild. Am I taking the right approach?

DDL:

```
CREATE TABLE [dbo].T1 NOT NULL,
[C6] [real] NULL,
[C7] [real] NULL,
[C8] [real] NULL,
[C9] [real] NULL,
[C10] [real] NULL,
[C11] datetime2 NULL,
[C12] [tinyint] NULL
) ON [DATA]

CREATE CLUSTERED COLUMNSTORE INDEX [T1_ColumnStoreIndex] ON [T1]

CREATE TABLE [dbo].T2 NOT NULL,
[C7] [real] NULL,
[C8] [int] NULL,
[C9] [int] NULL,
[C10] [tinyint] NULL,
[C11] [tinyint] NULL,
[C12] [tinyint] NULL
) ON [DATA]

CREATE CLUSTERED COLUMNSTORE INDEX [T2_ColumnStoreIndex] ON [T2]

CREATE TABLE [dbo].T3 NOT NULL,

Solution

What I plan to do: •drop the columnstore index, create row-store index
(this is to align the data in logtime) •then drop the row store index,
convert the table into column store index (better compression and
queries later)

You can use CREATE CLUSTERED INDEX...WITH (DROP_EXISTING=ON) to change the existing columnstore clustered index to a b-tree and then CREATE CLUSTERED COLUMNSTORE INDEX...WITH (DROP_EXISTING=ON) to change back to columnstore. This will eliminate the drop index step that changes the columnstore to a heap.

I still expect this will take quite some time with 60 billions rows. Although source data might not have been loaded in C6 order, data overall are typically loaded in roughly chronological order so there will still be a temporal relationship for data in the same proximity. I suggest you review the min and max values in sys.column_store_segments to see if this effort is justified, considering that segment elimination is done after partition elimination. The performance benefit might not be a great as you think.

Context

StackExchange Database Administrators Q#162290, answer score: 2

Revisions (0)

No revisions yet.