HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMinor

Clustered Index fragmentation vs Index with Included columns fragmentation

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
clusteredfragmentationcolumnswithincludedindex

Problem

I have a relative large table (for me) with 40 million rows that is expected to grow to 80 to 120 million rows in a period of two weeks (during a campaign).

Tip
--------------
Id          int (clustered index)
UserId      int
TipIndex    smallint
Value       binary(8)
LastChanged datetime2(3)


  • every user has between 1 to 400 tips that will be inserted at random during this period



  • UserId + TipIndex is unique



  • I never query Tips directly on the Identity Key (Id)



  • 99% of the times I query on UserId



  • I need all columns



  • I often query on 1 UserId (per page view), sometimes on a batches of 10.000s for statistics



  • this is a high traffic site during this period and it should be able to handle 30.000 queries on UserId per minute



  • Id is at the moment my Clustered Index, because I read that it leads to the least fragmentation.



So I'm hosting on SQL Azure and Azure already recommended adding an Index with included columns.
I was always hesitant on using UserId,TipIndex as a Clustered Index, as Tips will be added at random. Meaning that I'm afraid of huge fragmentation problems etc.

My questions:

  • Doesn't an Index with included columns have the exact same problem?



  • Is a table with included columns not just the same as a "shadow table" with the same fragmentation problems?



  • Should I migrate to use UserId, TipIndex as a ClusteredIndex instead of Id?



  • How to prevent fragmentation?



I know in the end the answer is always "depends" or that I should measure it. But as I'm a solo developer and not with a lot of resources I am hoping on someone with more experience that has a gut feeling for this, so my first attempts have a higher chance of going in the right direction.

Solution

Your questions:

Doesn't an Index with included columns have the exact same problem?

Yes.

Is a table with included columns not just the same as a "shadow table" with the same fragmentation problems?

Yes

Should I migrate to use UserId, TipIndex as a ClusteredIndex instead of Id?

I would, yes.

How to prevent fragmentation?

There are a couple of different types of fragmentation to consider. One is when you only have part of your pages being used because you’ve had page splits. If you have a lot of inserts, this will happen. Don’t stress too much. The other is when you have pages where the subsequent page is in a different extent. Again, I wouldn’t worry too much. If your data is mostly in the buffer cache, it doesn’t really matter if it moves across extents.

So... don’t worry about it too much. But don’t bother having a complete copy of the data in a way that you won’t actually be querying it.

Context

StackExchange Database Administrators Q#282281, answer score: 6

Revisions (0)

No revisions yet.