HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Data structure for efficient searching, when insertions and removals are only one-sided

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
searchingareefficientonesidedforstructurewhenandremovals

Problem

I need a data structure for storing a number $n$ of elements, each of whom is associated with some different time $t_i$. $n$ varies and while it has a theoretical upper limit, this is many orders of magnitude larger than what is typically used.

Through my application I can ensure that:

-
Inserted elements are always newer than all existing elements, i.e., if an element associated with a time $\check{t}$ is inserted, then $\check{t}>t_i ∀ i ∈ {1,…,n}$. Elements are inserted one by one.

-
Only the oldest elements are removed, i.e., if element $j$ is removed, then $t_j

-
Apart from inserting and removing, the only thing I need to do is find the two neighbouring elements for some given time $\tilde{t}$ with $\min\limits_i t_i

My criteria for the data structure are:

  • Finding elements as described above should be as quick as possible.



  • Inserting and removing should be quick.



  • The data structure is comparably simple to implement.



As long as we are not talking about a small runtime offset, each criterion has priority over the next.

My research so far has yielded that the answer is likely some kind of self-balancing search tree, but I failed to find any information which of them is best for the case of one-sided inserting or deleting, and it will probably cost me a considerable time to find out myself.
Also, I only found incomplete information about how well the trees self-organise and how quickly (e.g., AVL trees self-organise more rigidly than red-black trees), let alone how this is affected by one-sided inserting or deleting.

Solution

Store the elements as a sequence, sorted by increasing timestamp. Use binary search to find the location where $\tilde{t}$ would occur if it were in the array; then you can easily find the two neighboring elements. Finding the two neighboring elements can be done in $O(\lg n)$ time.

You'll also need to be able to append to the end of the sequence and delete from the beginning. Thus, basically you need a queue.

There are standard constructions for a queue. For instance, you can store them in an array, with amortized $O(1)$-time insert and delete operations. Basically, you have an array for the elements of the sequence, and a start index (for the beginning of the sequence) and an end index (for the end of the sequence). To delete from the beginning, increment the start index. To add to the end, increment the end index; if this runs past the end of the existing array, allocate a new array of double the size and copy into the new array.

Alternatively: you can store the elements in a balanced binary tree. This will achieve worst-case $O(\lg n)$-time for all operations.

Context

StackExchange Computer Science Q#55703, answer score: 5

Revisions (0)

No revisions yet.