patterncppMinor
An intrusive chained Hash Table
Viewed 0 times
tableintrusivechainedhash
Problem
I was interested in experimenting with some intrusive versions of common data structures, so I've written this simple chained hash-table using the approach:
```
#ifndef INTRUSIVE_HASH_TABLE_HPP
#define INTRUSIVE_HASH_TABLE_HPP
#include
#include
#include
#include
// Types that will be inserted into the IntrusiveHashTable
// must inherit from this template class.
template
class HashTableNode
{
public:
// Hash-table needs access to internal data of its nodes.
template
friend class IntrusiveHashTable;
// IntrusiveHashTable interface:
std::size_t getHashTableKeyHash() const { return htKeyHash; }
T * getHashTableNext() const { return htNext; }
bool isLinkedToHashTable() const { return htKeyHash != 0; }
protected:
HashTableNode() : htNext(nullptr), htKeyHash(0) { }
~HashTableNode() { }
private:
// Next node in the IntrusiveHashTable bucket chain.
// Null if this is the last item or if not linked.
T * htNext;
// Hash of the node's key.
// Zero only if not linked to a table.
std::size_t htKeyHash;
};
// This hash-table does not manage memory or lifetime of the objects inserted.
// Removing an item from the table WILL NOT destroy the object, just unlink it.
template
// Hashes a key instance
>
class IntrusiveHashTable
{
public:
// Nested typedefs:
using ValueType = V;
using KeyHasher = HASH;
using KeyType = typename std::remove_cv::type;
// No copy or assignment:
IntrusiveHashTable(const IntrusiveHashTable &) = delete;
IntrusiveHashTable & operator = (const IntrusiveHashTable &) = delete;
// Construct empty (no allocation). Allocates on first insertion.
explicit IntrusiveHashTable(bool allowDuplicateKeys);
// Construct and allocate storage with num buckets hint.
IntrusiveHashTable(bool allowDuplicateKeys, std::size_t sizeHint);
// Destructor clears the table and unlinks all items.
~IntrusiveHashTable();
// E
```
#ifndef INTRUSIVE_HASH_TABLE_HPP
#define INTRUSIVE_HASH_TABLE_HPP
#include
#include
#include
#include
// Types that will be inserted into the IntrusiveHashTable
// must inherit from this template class.
template
class HashTableNode
{
public:
// Hash-table needs access to internal data of its nodes.
template
friend class IntrusiveHashTable;
// IntrusiveHashTable interface:
std::size_t getHashTableKeyHash() const { return htKeyHash; }
T * getHashTableNext() const { return htNext; }
bool isLinkedToHashTable() const { return htKeyHash != 0; }
protected:
HashTableNode() : htNext(nullptr), htKeyHash(0) { }
~HashTableNode() { }
private:
// Next node in the IntrusiveHashTable bucket chain.
// Null if this is the last item or if not linked.
T * htNext;
// Hash of the node's key.
// Zero only if not linked to a table.
std::size_t htKeyHash;
};
// This hash-table does not manage memory or lifetime of the objects inserted.
// Removing an item from the table WILL NOT destroy the object, just unlink it.
template
// Hashes a key instance
>
class IntrusiveHashTable
{
public:
// Nested typedefs:
using ValueType = V;
using KeyHasher = HASH;
using KeyType = typename std::remove_cv::type;
// No copy or assignment:
IntrusiveHashTable(const IntrusiveHashTable &) = delete;
IntrusiveHashTable & operator = (const IntrusiveHashTable &) = delete;
// Construct empty (no allocation). Allocates on first insertion.
explicit IntrusiveHashTable(bool allowDuplicateKeys);
// Construct and allocate storage with num buckets hint.
IntrusiveHashTable(bool allowDuplicateKeys, std::size_t sizeHint);
// Destructor clears the table and unlinks all items.
~IntrusiveHashTable();
// E
Solution
Overall, it looks good, here are a few points you can improve on:
HashTableNode:
-
You could use direct initialization of the member data and omit the default constructor. However, if your intention is to make construction
IntrusiveHashTable:
-
Applying
-
Move-semantics | NRVO can elide the copy for the return value, so this will be quite efficient too.
-
The internal array of buckets (
I would recommend a
Then in places where you have a loop like this:
Could be replaced with a range based for loop.
-
You can use the trailing return type syntax to simplify the implementation of a few methods. Instead of using the old verbose syntax involving
It can be shortened to:
That simplification can be applied to several methods in the
-
Remember that assertions can be disabled!
This assertion might be disabled on a "release" build, so if that condition should ever happen, the error would go unnoticed. While this approach suffices in some cases, a more robust one would be throwing an exception.
HashTableNode:
-
You could use direct initialization of the member data and omit the default constructor. However, if your intention is to make construction
protected, you'll still need a default constructor inside that section. The same is valid for the destructor, which is empty and so should also be a default:template
class HashTableNode
{
...
protected:
HashTableNode() = default;
~HashTableNode() = default;
private:
T * htNext = nullptr;
std::size_t htKeyHash = 0;
};IntrusiveHashTable:
-
Applying
std::remove_cv on the key type seems unnecessary. It works just as well without it.-
findAllMatching() is written in C-style, that is, taking an array of pointers of a fixed size. It would be more modern to return a vector instead, which also wouldn't limit the number of matches by the size of the input array:std::vector findAllMatching(const KeyType & key) const;Move-semantics | NRVO can elide the copy for the return value, so this will be quite efficient too.
-
The internal array of buckets (
ValueType ** table) is a raw pointer allocated with new and deleted in the destructor. Even though this pointer is encapsulated inside a class with well defined copy behavior, you could clean up the code further by using a unique_ptr or a vector to manage memory. This would also better separate responsibilities. Code that does too much is more prone to bugs (separation of concerns).I would recommend a
std::vector, which then makes bucketCount obsolete (thanks to the size() method) and also provides iterators and debug bounds checking for your iterations.Then in places where you have a loop like this:
for (std::size_t bucket = 0; bucket < bucketCount; ++bucket)Could be replaced with a range based for loop.
-
You can use the trailing return type syntax to simplify the implementation of a few methods. Instead of using the old verbose syntax involving
typename:template
typename IntrusiveHashTable::ValueType * IntrusiveHashTable::find(const KeyType & key) const { /*... */ }It can be shortened to:
template
auto IntrusiveHashTable::find(const KeyType & key) const -> ValueType * { /* ... */ }That simplification can be applied to several methods in the
IntrusiveHashTable class.-
Remember that assertions can be disabled!
...
else
{
assert(false && "IntrusiveHashTable bucket chain is corrupted!");
}This assertion might be disabled on a "release" build, so if that condition should ever happen, the error would go unnoticed. While this approach suffices in some cases, a more robust one would be throwing an exception.
Code Snippets
template<class T>
class HashTableNode
{
...
protected:
HashTableNode() = default;
~HashTableNode() = default;
private:
T * htNext = nullptr;
std::size_t htKeyHash = 0;
};std::vector<ValueType *> findAllMatching(const KeyType & key) const;for (std::size_t bucket = 0; bucket < bucketCount; ++bucket)template<class K, class V, class HASH>
typename IntrusiveHashTable<K, V, HASH>::ValueType * IntrusiveHashTable<K, V, HASH>::find(const KeyType & key) const { /*... */ }template<class K, class V, class HASH>
auto IntrusiveHashTable<K, V, HASH>::find(const KeyType & key) const -> ValueType * { /* ... */ }Context
StackExchange Code Review Q#83490, answer score: 3
Revisions (0)
No revisions yet.