HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

time efficient key value store for fast lookup

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
fastefficientvaluetimestoreforlookupkey

Problem

Let's state a collection of key/values.

Key is an unsigned integer (0 - 2^32-1). Hense there are comparable.
Value is few bytes fixed size.

There are 4M items in the collection.
Keys are not evenly spread in the key space i.e there is no way to predict hole size between two consecutive keys.

This collection is created once and never modify after that, even if not prohibited.

The aim is to lookup a value by its key. But if key is not found then the previous one (as in integer natural order) must be returned.

I'm looking for a data-structure that can be mmap on disk which would be reasonably space efficient but most importantly very time efficient for
the lookup function.

There is no known pattern on how the keys are queried.

Solution

Make a table with 2^22 entries to lookup the highest 22 bits of the key. Each entry is responsible for one value on average (but may contain up to 1024).

Entry #i in the table, which is responsible for 0 to 1024 values with keys from 1024i to 1024i + 1023, contains the following: 1. The number of keys in that range. 2. If there are 0 keys, the value of the next lower available key. If there is 1 key, the lower 10 bit of the key in that range, and its value. If there are two or more keys, an index j into a second table.

In the second table, we store the lower 10 bit of the key, plus the value, for every key where there are two or more keys in the range from 1024i to 1024i + 1023. For values in the second table, we use binary search.

Context

StackExchange Computer Science Q#77265, answer score: 3

Revisions (0)

No revisions yet.