snippetMinor
How do databases store index key values (on-disk) for variable length fields?
Viewed 0 times
databasesdisklengthfieldsstoreforhowvaluesindexvariable
Problem
Context
This question pertains to the low-level implementation details of indexes in both SQL and NoSQL database systems. Actual structure of the index (B+ tree, hash, SSTable, etc.) is irrelevant as the question pertains specifically to the keys stored inside a single node of any of those implementations.
Background
In SQL (e.g. MySQL) and NoSQL (CouchDB, MongoDB, etc.) databases, when you build an index on a column or JSON document field of data, what you are actually causing the database to do is create essentially a sorted list of all those values along with a file offset into the main data file where the record pertaining to that value lives.
(For simplicity sake, I may be hand-waving away other esoteric details of specific impls)
Simple Classic SQL Example
Consider a standard SQL table that has a simple 32-bit int primary key that we create an index on, we will end up with an index on-disk of the integer keys sorted and associated with a 64-bit offset into the data file where the record lives, e.g.:
The on-disk representation of the keys in the index looks something like this:
Sticking to standard rules of thumb about optimizing disk I/O with filesystems and database systems, let's say you store keys in 4KB blocks on disk, which means:
Ignoring the overall structure of the index (B+ tree, hash, sorted list, etc.) we read and write blocks of 341 keys at a time into memory and back out to disk as needed.
Example Query
Using the information from the previous section, let's say a query comes in for "id=2", the classic DB index lookup goes as follows:
This question pertains to the low-level implementation details of indexes in both SQL and NoSQL database systems. Actual structure of the index (B+ tree, hash, SSTable, etc.) is irrelevant as the question pertains specifically to the keys stored inside a single node of any of those implementations.
Background
In SQL (e.g. MySQL) and NoSQL (CouchDB, MongoDB, etc.) databases, when you build an index on a column or JSON document field of data, what you are actually causing the database to do is create essentially a sorted list of all those values along with a file offset into the main data file where the record pertaining to that value lives.
(For simplicity sake, I may be hand-waving away other esoteric details of specific impls)
Simple Classic SQL Example
Consider a standard SQL table that has a simple 32-bit int primary key that we create an index on, we will end up with an index on-disk of the integer keys sorted and associated with a 64-bit offset into the data file where the record lives, e.g.:
id | offset
--------------
1 | 1375
2 | 1413
3 | 1786The on-disk representation of the keys in the index looks something like this:
[4-bytes][8-bytes] --> 12 bytes for each indexed valueSticking to standard rules of thumb about optimizing disk I/O with filesystems and database systems, let's say you store keys in 4KB blocks on disk, which means:
4096 bytes / 12 bytes per key = 341 keys per blockIgnoring the overall structure of the index (B+ tree, hash, sorted list, etc.) we read and write blocks of 341 keys at a time into memory and back out to disk as needed.
Example Query
Using the information from the previous section, let's say a query comes in for "id=2", the classic DB index lookup goes as follows:
- Read the root of the index (in this case, 1 block)
- Binary-search the sorted block to find the key
- Get the data file offset from the value
- Look up the record in the data file using the offset
- Return
Solution
You can store your index as a list of fixed-size offsets into the block containing your key data. For example:
(well, the key data would be sorted in a real example, but you get the idea).
Note that this does not necessarily reflect how index blocks are actually constructed in any database. This is merely an example of how you might organise a block of index data where the key data is of variable length.
+--------------+
| 3 | number of entries
+--------------+
| 16 | offset of first key data
+--------------+
| 24 | offset of second key data
+--------------+
| 39 | offset of third key data
+--------------+
| key one |
+----------------+
| key number two |
+-----------------------+
| this is the third key |
+-----------------------+(well, the key data would be sorted in a real example, but you get the idea).
Note that this does not necessarily reflect how index blocks are actually constructed in any database. This is merely an example of how you might organise a block of index data where the key data is of variable length.
Code Snippets
+--------------+
| 3 | number of entries
+--------------+
| 16 | offset of first key data
+--------------+
| 24 | offset of second key data
+--------------+
| 39 | offset of third key data
+--------------+
| key one |
+----------------+
| key number two |
+-----------------------+
| this is the third key |
+-----------------------+Context
StackExchange Database Administrators Q#11189, answer score: 9
Revisions (0)
No revisions yet.