HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Timestamp-based index

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
indexbasedtimestamp

Problem

I have a big database (200GB+) that contains some log info. And I want to speed up SELECT queries and stored procedures. I have a table with a GeneratedOnUtc datetime column, and have a non-clustered index on it.

I'm thinking to change it to a clustered index.

Reasons for:

-
Big amount of data (~40 millions rows)

-
Column is used in multiple Where clauses (between, >,

-
Column is used in
ROW_NUMBER() OVER (ORDER BY d.GeneratedOnUtc asc) AS Row` queries

Reason against:

  • Large amount of inserts (~60k per day) may lead to frequent B-tree rebuilds.

Solution

Changing your table from a heap to having a clustered index should significantly improve your performance on both queries and perhaps even on inserts. Generally speaking, your clustered index should be narrow, unique, and ever increasing. Using a datetime that you can't guarantee to be unique is not ideal because it's 8 bytes and, since it isn't unique, sql will add a four byte uniquifier to non unique rows. You may be better off using an identity column with an int as your clustered index and, since that's what the "row_number" queries are really after anyway (an ever increasing unique number) that might be a great way to go especially if you have a number of nonclustered indexes already (because the clustered is used as a row pointer by the nonclustered so it adds size to them). The 4 billion available int values leave you decades of growth.

I suggest you make a testing copy of your database and then test it with a clustered index on your datetime value and contrast that with a test using a clustered index on a new identity column (and a nonclustered index on your datetime). See which one does better with your query load. Both scenarios will outperform a heap.

Context

StackExchange Database Administrators Q#90277, answer score: 4

Revisions (0)

No revisions yet.