HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How do databases store index key values (on-disk) for variable length fields?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
databasesdisklengthfieldsstoreforhowvaluesindexvariable

Problem

Context

This question pertains to the low-level implementation details of indexes in both SQL and NoSQL database systems. Actual structure of the index (B+ tree, hash, SSTable, etc.) is irrelevant as the question pertains specifically to the keys stored inside a single node of any of those implementations.

Background

In SQL (e.g. MySQL) and NoSQL (CouchDB, MongoDB, etc.) databases, when you build an index on a column or JSON document field of data, what you are actually causing the database to do is create essentially a sorted list of all those values along with a file offset into the main data file where the record pertaining to that value lives.

(For simplicity sake, I may be hand-waving away other esoteric details of specific impls)

Simple Classic SQL Example

Consider a standard SQL table that has a simple 32-bit int primary key that we create an index on, we will end up with an index on-disk of the integer keys sorted and associated with a 64-bit offset into the data file where the record lives, e.g.:

id   | offset
--------------
1    | 1375
2    | 1413
3    | 1786


The on-disk representation of the keys in the index looks something like this:

[4-bytes][8-bytes] --> 12 bytes for each indexed value


Sticking to standard rules of thumb about optimizing disk I/O with filesystems and database systems, let's say you store keys in 4KB blocks on disk, which means:

4096 bytes / 12 bytes per key = 341 keys per block


Ignoring the overall structure of the index (B+ tree, hash, sorted list, etc.) we read and write blocks of 341 keys at a time into memory and back out to disk as needed.

Example Query

Using the information from the previous section, let's say a query comes in for "id=2", the classic DB index lookup goes as follows:

  • Read the root of the index (in this case, 1 block)



  • Binary-search the sorted block to find the key



  • Get the data file offset from the value



  • Look up the record in the data file using the offset



  • Return

Solution

You can store your index as a list of fixed-size offsets into the block containing your key data. For example:

+--------------+
| 3            | number of entries
+--------------+
| 16           | offset of first key data
+--------------+
| 24           | offset of second key data
+--------------+
| 39           | offset of third key data
+--------------+
| key one |
+----------------+
| key number two |
+-----------------------+
| this is the third key |
+-----------------------+


(well, the key data would be sorted in a real example, but you get the idea).

Note that this does not necessarily reflect how index blocks are actually constructed in any database. This is merely an example of how you might organise a block of index data where the key data is of variable length.

Code Snippets

+--------------+
| 3            | number of entries
+--------------+
| 16           | offset of first key data
+--------------+
| 24           | offset of second key data
+--------------+
| 39           | offset of third key data
+--------------+
| key one |
+----------------+
| key number two |
+-----------------------+
| this is the third key |
+-----------------------+

Context

StackExchange Database Administrators Q#11189, answer score: 9

Revisions (0)

No revisions yet.