patternsqlMinor
Help with indexing narrow table with many millions of rows
Viewed 0 times
rowsmillionswithindexinghelpmanynarrowtable
Problem
I have a good amount of experience with SQL Server, but not much with indexes and I fear I'm a little bit out of my depth in trying to figure out how to correctly structure a table that has 100+ million rows. I know this gets asked a lot, but I haven't found an answer that matches exactly what I'm trying to do.
The table
As you can see, this is a table that stores commodity trades. Because of that, there are currently 100+ million rows in the table, with about 10,000 rows being added daily (at the end of the day).
I'm at a bit of a loss as to how to index this table to maximize performance. I don't so much care about insert performance since the table is updated once a day and it doesn't need to be a fast operation. Rows are also never updated once inserted. The big problem is query speed. This table is queried relatively often.
By far the most common queries are on the
Currently, there is a unique clustered index on the
Any help is greatly appreciated.
The table
CREATE TABLE [dbo].[Trades]
(
[Id] [int] IDENTITY(1,1) NOT NULL,
[TimeStamp] [datetime] NOT NULL,
[Type] [int] NOT NULL,
[Amount] [decimal](18, 2) NOT NULL,
[Price] [decimal](18, 2) NOT NULL,
[Exchange_Id] [int] NOT NULL
)As you can see, this is a table that stores commodity trades. Because of that, there are currently 100+ million rows in the table, with about 10,000 rows being added daily (at the end of the day).
I'm at a bit of a loss as to how to index this table to maximize performance. I don't so much care about insert performance since the table is updated once a day and it doesn't need to be a fast operation. Rows are also never updated once inserted. The big problem is query speed. This table is queried relatively often.
By far the most common queries are on the
[TimeStamp] column alone (eg, TimeStamp between two dates), or on the [TimeStamp] column together with the [Exchange_Id] column (eg, trades on a certain Exchange between two [TimeStamp]s).Currently, there is a unique clustered index on the
[Id] column, but I'm not sure if that's the right thing to do based on the circumstances. I've considered making the clustered index on the [TimeStamp] column, but it would have to be non-unique which I've read is generally bad. As for non-clustered indexes, I'm not quite sure how to arrange them or on what columns.Any help is greatly appreciated.
Solution
Based on your question, I would index the
If query performance for queries on
.. but if you do, consider including any columns that the query may need, in order to create what's known as a covering index.
Remember that there's a disk space issue involved as well, as you stated that your table contains a large number of rows. The clustered index will not change the amount of disk space your table consumes either way, but adding a non-clustered index will allocate extra space. If you
Timestamp column with the clustered index. And to make the index unique, just make sure to include the identity column in the index definition:... PRIMARY KEY CLUSTERED ([Timestamp], [Id])If query performance for queries on
Exchange_Id is still an issue after that, you can also add a non-clustered index that looks something like this:UNIQUE INDEX ([Exchange_Id], [Timestamp], [Id]).. but if you do, consider including any columns that the query may need, in order to create what's known as a covering index.
UNIQUE INDEX ([Exchange_Id], [Timestamp], [Id]) INCLUDE ([Type], [Amount], [Price])Remember that there's a disk space issue involved as well, as you stated that your table contains a large number of rows. The clustered index will not change the amount of disk space your table consumes either way, but adding a non-clustered index will allocate extra space. If you
INCLUDE all the columns from your table, like I did in the example, the non-clustered index will roughly take up as much space as the rest of the table does.Code Snippets
... PRIMARY KEY CLUSTERED ([Timestamp], [Id])UNIQUE INDEX ([Exchange_Id], [Timestamp], [Id])UNIQUE INDEX ([Exchange_Id], [Timestamp], [Id]) INCLUDE ([Type], [Amount], [Price])Context
StackExchange Database Administrators Q#83738, answer score: 4
Revisions (0)
No revisions yet.