patterncppMinor
Skip List implementation
Viewed 0 times
listimplementationskip
Problem
Skip lists are a probabilistic alternative to balanced trees, they are balanced by consulting a random number generator, that determines how many pointerscalled node level to successive elements a node will have.
Skip list is a simple data structure that allows fast search within an ordered sequence of elements. Fast search is made possible by maintaining a linked hierarchy of subsequences, each skipping over fewer elements, as depicted on the following figure:
According to the paper this data structure offers same, in some cases even better performance of alternative data structures like: AVL trees, self adjusting trees, as one can see on the following table:
with relatively low implementation cost:
"However, implementing balanced trees is an exacting
task and as a result balanced tree algorithms are rarely implemented
except as part of a programming assignment in a data
structures class.
Skip lists are a simple data structure that can be used in
place of balanced trees for most applications."
The following code implements a Skip List based on "Skip Lists: A Probabilistic Alternative to Balanced Trees" by W.Pugh.
```
#ifndef SKIP_LIST_H
#define SKIP_LIST_H
//==============================================================================
struct Skip_Node {
int key;
std::string value;
// pointers to successor nodes
std::vector forward;
Skip_Node (int k, const std::string& v, int level);
};
//==============================================================================
class Skip_list {
public:
Skip_list ();
~Skip_list ();
// non-modifying member functions
void print ();
Skip_Node* find (int searchKey);
// modifying member functions
void insert (int searchKey, std::string newValue);
void erase (int searchKey);
private:
// pointer to first node
Skip_Node* head;
// last node
Skip_Node* NIL;
// implicitly used member functions
int randomLevel ();
Skip list is a simple data structure that allows fast search within an ordered sequence of elements. Fast search is made possible by maintaining a linked hierarchy of subsequences, each skipping over fewer elements, as depicted on the following figure:
According to the paper this data structure offers same, in some cases even better performance of alternative data structures like: AVL trees, self adjusting trees, as one can see on the following table:
with relatively low implementation cost:
"However, implementing balanced trees is an exacting
task and as a result balanced tree algorithms are rarely implemented
except as part of a programming assignment in a data
structures class.
Skip lists are a simple data structure that can be used in
place of balanced trees for most applications."
The following code implements a Skip List based on "Skip Lists: A Probabilistic Alternative to Balanced Trees" by W.Pugh.
Skip_list.h```
#ifndef SKIP_LIST_H
#define SKIP_LIST_H
//==============================================================================
struct Skip_Node {
int key;
std::string value;
// pointers to successor nodes
std::vector forward;
Skip_Node (int k, const std::string& v, int level);
};
//==============================================================================
class Skip_list {
public:
Skip_list ();
~Skip_list ();
// non-modifying member functions
void print ();
Skip_Node* find (int searchKey);
// modifying member functions
void insert (int searchKey, std::string newValue);
void erase (int searchKey);
private:
// pointer to first node
Skip_Node* head;
// last node
Skip_Node* NIL;
// implicitly used member functions
int randomLevel ();
Solution
Your code is a real pleasure to read because it is clean and well written - I wish more of the code here at CR were like this.
Stilistically the focus could be improved by factoring all key properties into a traits class. This would make the skip list code key-agnostic, so that it could be used without change for different kinds of key. This can be advantageous even before the class is turned into a template, and it would make the step from concrete class to a class template a lot easier.
At the moment the maximum node height is fixed; this means it is bound to be overkill for small numbers of elements and insufficient for big numbers. Most of the code could simply look at the head node to find the current max but the insert code would have to decide whether an increase is warranted (i.e. whenever the element count gains a new 1-bit at the front then a new 'express-way' should be opened by increasing the head node height). This would give more even performance over greater ranges of element counts than is currently the case.
Using a
It can be very instructive to run systematic performance tests with geometrically increasing element counts (e.g. 10^1, 10^2, ..., 10^10) separately for search, insert and delete, and to compare the results to (sorted)
It can be similarly instructive to compare expected behaviour to actual behaviour. E.g. compare theoretically predicted comparison counts to actual comparison counts by instrumenting the code, for example with an interposer class for the key traits that wraps the
There's a well-written article on skip lists on MSDN that is very good for getting into the spirit of things. It's part of a series of articles on data structures: Part 4: Building a Better Binary Search Tree It starts with binary trees and lists but more than half of it is devoted to skip lists.
Stilistically the focus could be improved by factoring all key properties into a traits class. This would make the skip list code key-agnostic, so that it could be used without change for different kinds of key. This can be advantageous even before the class is turned into a template, and it would make the step from concrete class to a class template a lot easier.
At the moment the maximum node height is fixed; this means it is bound to be overkill for small numbers of elements and insufficient for big numbers. Most of the code could simply look at the head node to find the current max but the insert code would have to decide whether an increase is warranted (i.e. whenever the element count gains a new 1-bit at the front then a new 'express-way' should be opened by increasing the head node height). This would give more even performance over greater ranges of element counts than is currently the case.
Using a
std::vector in every node gives a lot of flexibility and greatly facilitates experimentation. However, it also makes the skip list very resource-hungry. Since height of nodes is fixed at creation (apart from head nodes, but those can be dealt with differently) an alternative for high-performance production implementations would be a special-purpose allocator in the spirit of buddy systems.It can be very instructive to run systematic performance tests with geometrically increasing element counts (e.g. 10^1, 10^2, ..., 10^10) separately for search, insert and delete, and to compare the results to (sorted)
std::vector<>, std::map<>, std::unordered_map<> and so on, to get a feel for how skip lists behave. Unfortunately there's no B-tree template in the standard, so there's no data structure available for comparison that can take a bit of load (i.e. that's suitable for more than homoeopathic numbers of elements).It can be similarly instructive to compare expected behaviour to actual behaviour. E.g. compare theoretically predicted comparison counts to actual comparison counts by instrumenting the code, for example with an interposer class for the key traits that wraps the
less() predicate.There's a well-written article on skip lists on MSDN that is very good for getting into the spirit of things. It's part of a series of articles on data structures: Part 4: Building a Better Binary Search Tree It starts with binary trees and lists but more than half of it is devoted to skip lists.
Context
StackExchange Code Review Q#116345, answer score: 7
Revisions (0)
No revisions yet.