HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Tree/array mashup

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
arraymashuptree

Problem

I'm currently trying to prove a data structure which I've created to speed up lookups for integral keys. The specific case I have in mind has the following usage case:

  • At start is populated with n number of items (where n can be from 1 to thousands), and the specific key is actually an unsigned int.



  • The critical path in the code requires a lookup based on this key.



I've tried the following:

  • sorted vector with binary search



  • std::map



  • boost::unordered_map



And now this tree. I've found through profiling that the following holds

  • If every item looked up exists, then unordered_set is quicker than std::map which is quicker than sorted vector



  • If the number of successful lookups for the search set is low, then std::map is quicker than unordered_set (I guess the hash overhead is higher for failure case)



With this tree, I've found that it's quicker in all cases, but what I would like is some critique on the approach - are there any further optimisations that I am missing for example, and here is the tree:

```
/** Copyright: Nim 2011 */
template
class tree
{
public:
typedef ValueType value_type;
typedef ValueType& reference;
typedef ValueType const& const_reference;
typedef KeyType key_type;
typedef ValueType* iterator;
typedef ValueType const* const_iterator;

private:
static const size_t MAX_SIZE = 256;

typedef unsigned char index_type;

struct node
{
node() : _cKey(), _cValue(), _vNodes() {}

~node()
{
cleanup();
}

void add(key_type index, key_type key, const_reference value)
{
if (!index)
{
_cKey = key;
_cValue.reset(new value_type(value));
return;
}
// get the lsb
index_type lsb = index & 0xff;
if (!_vNodes)
{
_vNodes = new node*[MAX_SIZE]();
_vNodes[lsb] = new node();
}
else if (!_vNodes[lsb])
_vNodes[lsb] = new node();

_vNodes[lsb]->add(index >>

Solution

Just a few points (caveat: I'm not a C++ expert).

-
Comparing structured (i.e., non-scalar) keys is often the most expensive part of doing a search; it is usually worth searching first on an integer hash of the key. This way you only need to do integer comparisons for the most part.

-
I am AMAZINGLY surprised that binary search on a sorted array doesn't beat the pants off everything else. Binary search is vastly simpler than anything else and has optimal pruning. I seriously suspect your results here are to do with issue (1) above.

-
You don't say how much your results differed in your experiments. If the differences are small, it probably isn't worth rolling your own code.

Context

StackExchange Code Review Q#3294, answer score: 2

Revisions (0)

No revisions yet.