patterncppMinor
Red-Black Tree Implementation
Viewed 0 times
implementationredtreeblack
Problem
This is my implementation of a red-black tree that I'm planning to use in a little personal project. It works fine, however I'm not sure that the code is very good as I'm only a beginner, and insertions are 4x slower than in a broken implementation I found here (it stops working after a few insertions).
I was wondering if I did something unnecessary that slows down the algorithm or is it just because my nodes also have keys instead of just data. I'd greatly appreciate any suggestions and corrections.
```
enum Color { Red = 0, Black };
template
class RBTree
{
public:
RBTree();
~RBTree();
void Insert(TKey Key, TData Data);
void Delete(TKey Key);
TData * Find(TKey Key);
private:
struct Node
{
Node(const TKey & Key, const TData & Data, Node * Parent);
~Node();
Color _Color;
Node * _Link[2];
Node * _Parent;
TKey _Key;
TData _Data;
};
Node * _Root;
};
template
inline RBTree::RBTree() :
_Root(nullptr)
{
}
template
inline RBTree::~RBTree()
{
delete _Root;
}
template
inline void RBTree::Insert(TKey Key, TData Data)
{
Node * pNode = nullptr;
Node * pParent = nullptr;
Node * pUncle = nullptr;
Node * pGrandparent = nullptr;
Node * T = nullptr;
int Dir;
int PDir;
if (!_Root)
{
_Root = new Node(Key, Data, nullptr);
_Root->_Color = Black;
}
else
{
pNode = _Root;
while (pNode)
{
pParent = pNode;
Dir = Key > pNode->_Key;
pNode = pNode->_Link[Dir];
}
pParent->_Link[Dir] = new Node(Key, Data, pParent);
pNode = pParent->_Link[Dir];
TKey k;
TData d;
Node * n;
while (pNode->_Parent && pNode->_Parent != _Root && pParent->_Color == Red)
{
pGrandparent = pParent->_Parent;
PDir = pParent->_Data > pGrandparent->_Data;
pUncle = pGrandparent->_Link[!PDir];
I was wondering if I did something unnecessary that slows down the algorithm or is it just because my nodes also have keys instead of just data. I'd greatly appreciate any suggestions and corrections.
```
enum Color { Red = 0, Black };
template
class RBTree
{
public:
RBTree();
~RBTree();
void Insert(TKey Key, TData Data);
void Delete(TKey Key);
TData * Find(TKey Key);
private:
struct Node
{
Node(const TKey & Key, const TData & Data, Node * Parent);
~Node();
Color _Color;
Node * _Link[2];
Node * _Parent;
TKey _Key;
TData _Data;
};
Node * _Root;
};
template
inline RBTree::RBTree() :
_Root(nullptr)
{
}
template
inline RBTree::~RBTree()
{
delete _Root;
}
template
inline void RBTree::Insert(TKey Key, TData Data)
{
Node * pNode = nullptr;
Node * pParent = nullptr;
Node * pUncle = nullptr;
Node * pGrandparent = nullptr;
Node * T = nullptr;
int Dir;
int PDir;
if (!_Root)
{
_Root = new Node(Key, Data, nullptr);
_Root->_Color = Black;
}
else
{
pNode = _Root;
while (pNode)
{
pParent = pNode;
Dir = Key > pNode->_Key;
pNode = pNode->_Link[Dir];
}
pParent->_Link[Dir] = new Node(Key, Data, pParent);
pNode = pParent->_Link[Dir];
TKey k;
TData d;
Node * n;
while (pNode->_Parent && pNode->_Parent != _Root && pParent->_Color == Red)
{
pGrandparent = pParent->_Parent;
PDir = pParent->_Data > pGrandparent->_Data;
pUncle = pGrandparent->_Link[!PDir];
Solution
Issues
Rule of Three/Five
You don't obey the rule of Three (or Five). Thus your code can easily break. Did you know that the compiler generates the copy constructor and assignment operator for you? The compiler-generated version does not work well with owned RAW pointers (yours are owned because you define the destructor).
You can use the rule of Zero to help you out (by using smart pointers). But personally, I think you should implement the rule of three as you are building a container.
Moving objects
Your container takes ownership of the data (and key). It does so by making a copy of the object passed. You could potentially make your code much more efficient by correctly implementing move semantics on your insert:
General Comments
Avoid underscore
Avoid underscore as the first character in an identifier. The rules are complex and even those that think they know the rules get it wrong.
You do get it wrong:
All these members names are reserved for use by the implementation. See What are the rules about using an underscore in a C++ identifier?.
If you must use a prefix to identify members then
While we are on naming conventions: types are the most important part of creating C++, so make sure the types are easily identifiable by the reader. As such, a very common convention is to name user-defined types with an initial uppercase letter and objects (variables/functions) with an initial lowercase letter.
While we are still on naming conventions:
Order Of Member Initialization
The members of an object are initialized in the same order as declaration:
Your order here is not the same as the order of declaration. This is not going to cause an issue in this situation but it is untidy and bad practice (any you should have been warned about this by the compiler).
Always put the initializer list in the same order as declaration. That way, when it does matter you can easily spot it.
Also like normal variable declaration, please initialize one variable per line. Remember you are writing this code so that other people (that could be you in 6 months) can read it.
Don't declare variables until needed.
The
Self-documenting code
I am trying to read and understand the
That would be much more logical to read as something like:
Rule of Three/Five
You don't obey the rule of Three (or Five). Thus your code can easily break. Did you know that the compiler generates the copy constructor and assignment operator for you? The compiler-generated version does not work well with owned RAW pointers (yours are owned because you define the destructor).
You can use the rule of Zero to help you out (by using smart pointers). But personally, I think you should implement the rule of three as you are building a container.
Moving objects
Your container takes ownership of the data (and key). It does so by making a copy of the object passed. You could potentially make your code much more efficient by correctly implementing move semantics on your insert:
Insert(K const& key, V const& value); // The one you implement
Insert(K&& key, V&& value); // Move the key and value.
template
Insert(K const& key, Args... args); // Build the value in placeGeneral Comments
Avoid underscore
Avoid underscore as the first character in an identifier. The rules are complex and even those that think they know the rules get it wrong.
You do get it wrong:
Color _Color;
Node * _Link[2];
Node * _Parent;
TKey _Key;
TData _Data;All these members names are reserved for use by the implementation. See What are the rules about using an underscore in a C++ identifier?.
If you must use a prefix to identify members then
m_ seems popular. Personally, I think using a prefix is bad style. If you need to use a prefix that means you are using weak member names that need to be qualified. A better solution is to use more meaningful names.While we are on naming conventions: types are the most important part of creating C++, so make sure the types are easily identifiable by the reader. As such, a very common convention is to name user-defined types with an initial uppercase letter and objects (variables/functions) with an initial lowercase letter.
While we are still on naming conventions:
pParent!. Putting type information into the name is not useful p for pointer. Hungarian Notation is considered bad practice. It handcuffs you to specific types and thus makes your code brittle. Also type information is expressed very clearly by the actual type of the object when declared.Order Of Member Initialization
The members of an object are initialized in the same order as declaration:
inline RBTree::Node::Node(const TKey & Key, const TData & Data, Node * Parent) :
_Color(Red), _Parent(Parent), _Link(), _Key(Key), _Data(Data)Your order here is not the same as the order of declaration. This is not going to cause an issue in this situation but it is untidy and bad practice (any you should have been warned about this by the compiler).
Always put the initializer list in the same order as declaration. That way, when it does matter you can easily spot it.
Also like normal variable declaration, please initialize one variable per line. Remember you are writing this code so that other people (that could be you in 6 months) can read it.
inline RBTree::Node::Node(const TKey & Key, const TData & Data, Node * Parent)
: _Color(Red)
, _Link()
, _Parent(Parent)
, _Key(Key)
, _Data(Data)
{}Don't declare variables until needed.
Node * pNode = _Root;
int Dir;The
pNode is not needed for 6 rows and Dir is not needed for 13 rows. For a lot of types this makes a difference as the constructor is code that will be executed. If you don't need those variables the don't declare them is the general rule. Also keeping the declarations a long way from the point of use makes it hard to verify the types of the variable.Self-documenting code
I am trying to read and understand the
Insert(). But it's not easy going. No documentation and the code is not written as self-documenting either.while (pNode->_Parent && pNode->_Parent != _Root && pParent->_Color == Red)That would be much more logical to read as something like:
while (grandParentNotBalanced(pNode, pParent)) // Or something.Code Snippets
Insert(K const& key, V const& value); // The one you implement
Insert(K&& key, V&& value); // Move the key and value.
template<Args...>
Insert(K const& key, Args... args); // Build the value in placeColor _Color;
Node * _Link[2];
Node * _Parent;
TKey _Key;
TData _Data;inline RBTree<TKey, TData>::Node::Node(const TKey & Key, const TData & Data, Node * Parent) :
_Color(Red), _Parent(Parent), _Link(), _Key(Key), _Data(Data)inline RBTree<TKey, TData>::Node::Node(const TKey & Key, const TData & Data, Node * Parent)
: _Color(Red)
, _Link()
, _Parent(Parent)
, _Key(Key)
, _Data(Data)
{}Node * pNode = _Root;
int Dir;Context
StackExchange Code Review Q#163353, answer score: 6
Revisions (0)
No revisions yet.