patternsqlMajor
What is the benefit of indexing assuming that SQL server has huge memory to accomodate the entire database in memory?
Viewed 0 times
assumingthewhatentirebenefitindexingsqlaccomodatehugehas
Problem
I understand that index seek allows the server to quickly go to the desired page(s) by looking up the index, therefore the benefit is obtained by not having to read all pages of the queried table from disk into memory.
This question is assuming:
In this scenario, my question is to ask - for the subsequent queries on this table with the same WHERE clause (
This question is assuming:
- there is huge memory for the entire db to live in memory,
- the table being queried has no indexing,
- the query is to SELECT data from a table where say
rate>100
- the 1st query with above WHERE clause will do a scan to pull entire table's data pages from disk into memory (since there is no index on
rate).
In this scenario, my question is to ask - for the subsequent queries on this table with the same WHERE clause (
rate>50), the SQL engine will perform table scan on pages that already reside in memory. Does having an index or not on the rate column have any benefit to the 2nd query onwards when the entire table lives in memory and there is no need to access the disk?Solution
Doing less work is almost always faster than doing more work.
Assuming both the base table and index are fully present in memory, the index seek will do a lot less work if the table is large.
This work manifests primarily as CPU time and waiting on memory fetches. Memory is fast compared with most persistent storage, but it's still very slow compared with a CPU.
The table scan will have to access many more memory pages, and test more rows to see if
The index seek will locate the first matching row efficiently (by navigating down the b-tree), then scan to the end of the index following next-page pointers. It never needs to test individual rows to see if
Assuming a rowstore layout, the base table will have the minimum possible number of rows per 8KB page—all in-row columns are present on the page.
The index is likely to be more dense than the base table, since it only contains the indexed data. With more rows per 8KB page, fewer page accesses are needed, compared to the base table case.
There are additional considerations, like how pages belonging to the target (table or index) are located in the first place. For a heap table, this is done using IAM pages. An index seek locates the root of the b-tree from metadata, then navigates as already described. These methods are different, but for a completely in-memory situation there are unlikely to be very measurable differences between the two.
Scanning the whole table will also require more locks and page latches to be taken, consuming CPU and potentially blocking concurrent writers.
Remember an index is a separate subset of the base data sorted differently. The trade-off for the more efficient searching is additional storage and index maintenance when the base data changes.
If you're concerned about a particular situation on particular hardware, run a benchmark.
Assuming both the base table and index are fully present in memory, the index seek will do a lot less work if the table is large.
This work manifests primarily as CPU time and waiting on memory fetches. Memory is fast compared with most persistent storage, but it's still very slow compared with a CPU.
The table scan will have to access many more memory pages, and test more rows to see if
rate > x or not. Performing these comparison tests consumes CPU.The index seek will locate the first matching row efficiently (by navigating down the b-tree), then scan to the end of the index following next-page pointers. It never needs to test individual rows to see if
rate > x or not. The ordering of the index guarantees all rows found will match the predicate without making any value comparisons.Assuming a rowstore layout, the base table will have the minimum possible number of rows per 8KB page—all in-row columns are present on the page.
The index is likely to be more dense than the base table, since it only contains the indexed data. With more rows per 8KB page, fewer page accesses are needed, compared to the base table case.
There are additional considerations, like how pages belonging to the target (table or index) are located in the first place. For a heap table, this is done using IAM pages. An index seek locates the root of the b-tree from metadata, then navigates as already described. These methods are different, but for a completely in-memory situation there are unlikely to be very measurable differences between the two.
Scanning the whole table will also require more locks and page latches to be taken, consuming CPU and potentially blocking concurrent writers.
Remember an index is a separate subset of the base data sorted differently. The trade-off for the more efficient searching is additional storage and index maintenance when the base data changes.
If you're concerned about a particular situation on particular hardware, run a benchmark.
Context
StackExchange Database Administrators Q#316327, answer score: 23
Revisions (0)
No revisions yet.