Does the mere existence of a primary key increase query speed for large data sets?
Problem
So, I have a table with about 2.4M rows and ~30 columns. There is no unique column within the data. The data is transaction logs from stores, where each transaction number is only unique on a per-store basis and resets at midnight, but that's beside the point.
As it stands, queries on this table take upwards of one minute and I can't figure out why. I have added indexes on columns that have a high selectivity ratio, but it hasn't made any difference. So, if I add an auto-increment 'id' field for each row, would that speed things up, or are the queries slow because there are so many columns and so much unique data?
Solution
No, having a surrogate key would not speed things up.
You should consider what your queries are doing, and whether your indexes are sufficient.
For instance, if you are trying to find out the total sales for Widgets in Timbuktu on Christmas Day, then do you have an index on ProductID, StoreID, TransactionDate that also includes the SalesAmount? If not, how will that index manage to help you sum the SalesAmount column?
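As a rough sketch of what such an index could look like: the table name Transactions, the index name, the column types, and the literal ID values below are all assumptions made for illustration, not details from the question. The INCLUDE clause is SQL Server syntax (PostgreSQL 11+ has an equivalent); MySQL has no INCLUDE, so there you would append SalesAmount as a trailing key column instead.

```sql
-- Hypothetical cut-down table; the real one has ~30 columns.
CREATE TABLE Transactions (
    StoreID         INT            NOT NULL,
    ProductID       INT            NOT NULL,
    TransactionDate DATETIME       NOT NULL,
    SalesAmount     DECIMAL(10, 2) NOT NULL
);

-- Key columns support the seek; SalesAmount is carried in the index
-- so the SUM can be answered without touching the base table.
CREATE INDEX IX_Transactions_Product_Store_Date
    ON Transactions (ProductID, StoreID, TransactionDate)
    INCLUDE (SalesAmount);

-- Total sales for one product in one store on Christmas Day.
SELECT SUM(t.SalesAmount) AS TotalSales
FROM Transactions AS t
WHERE t.ProductID = 42                  -- assumed ID for Widgets
  AND t.StoreID = 7                     -- assumed ID for the Timbuktu store
  AND t.TransactionDate >= '20131225'   -- half-open range covers the whole day
  AND t.TransactionDate <  '20131226';
```

With the two equality columns first and the range column last, the engine can seek straight to the matching rows and read SalesAmount from the index itself. Checking the actual execution plan is the only way to confirm that the index is being used for your queries.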
Also consider sargability. To query against TransactionDate for records during December, you shouldn't apply any functions to the column. Instead, you should do something like:
AND t.TransactionDate >= '20131201' AND t.TransactionDate < '20140101'
because while you may have an index which involves TransactionDate, you almost certainly don't have an index involving 'TransactionDate converted to a month'.
Context
StackExchange Database Administrators Q#82736, answer score: 4