patternsqlMinor
Timestamp-based index
Viewed 0 times
indexbasedtimestamp
Problem
I have a big database (200GB+) that contains some log info. And I want to speed up
I'm thinking to change it to a clustered index.
Reasons for:
-
Big amount of data (~40 millions rows)
-
Column is used in multiple
Reason against:
SELECT queries and stored procedures. I have a table with a GeneratedOnUtc datetime column, and have a non-clustered index on it.I'm thinking to change it to a clustered index.
Reasons for:
-
Big amount of data (~40 millions rows)
-
Column is used in multiple
Where clauses (between, >,
-
Column is used in ROW_NUMBER() OVER (ORDER BY d.GeneratedOnUtc asc) AS Row` queriesReason against:
- Large amount of inserts (~60k per day) may lead to frequent B-tree rebuilds.
Solution
Changing your table from a heap to having a clustered index should significantly improve your performance on both queries and perhaps even on inserts. Generally speaking, your clustered index should be narrow, unique, and ever increasing. Using a datetime that you can't guarantee to be unique is not ideal because it's 8 bytes and, since it isn't unique, sql will add a four byte uniquifier to non unique rows. You may be better off using an identity column with an int as your clustered index and, since that's what the "row_number" queries are really after anyway (an ever increasing unique number) that might be a great way to go especially if you have a number of nonclustered indexes already (because the clustered is used as a row pointer by the nonclustered so it adds size to them). The 4 billion available int values leave you decades of growth.
I suggest you make a testing copy of your database and then test it with a clustered index on your datetime value and contrast that with a test using a clustered index on a new identity column (and a nonclustered index on your datetime). See which one does better with your query load. Both scenarios will outperform a heap.
I suggest you make a testing copy of your database and then test it with a clustered index on your datetime value and contrast that with a test using a clustered index on a new identity column (and a nonclustered index on your datetime). See which one does better with your query load. Both scenarios will outperform a heap.
Context
StackExchange Database Administrators Q#90277, answer score: 4
Revisions (0)
No revisions yet.