HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

In-memory B-Tree in C++11

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
treememorystackoverflow

Problem

I just wrote a simple B-Tree insertion implementation, and was wondering if anyone can comment / critique on code style, readability, maintainability etc. I tested it with a few small test cases and as far as I know, there are no obvious bugs in this.

I'm making extensive use of shared_ptr. I contemplated using unique_ptr, but since I'm maintaining parent pointers in every node, I settled with shared_ptr. I'm deriving the Node class from enable_shared_from_this because I want to be able to construct a shared_ptr on this pointer from within a function of the Node class. Is this a good pattern?

These are some ideas I got while writing this code (this may help you with the review):

  • A leaf node is always split into leaf nodes, similarly internal nodes that are split give more internal nodes.



  • A node with parent pointer null can be treated as root.



  • The SplitNode() function returns the sibling (a new node) and the median. Whenever a node is split, the median moves up. But all the children stay at the same level and get divided between the two nodes.



  • PropagateUp() receives the median key from one of its children and points to the split children.



  • Finally, I'm always inserting the new key into the node, and then splitting the Node if it exceeds Max. I found this to be easier to implement than to detect if the node is going to be full because of the incoming key, and then calculate the median without inserting - which could either be the new incoming value, or from the existing keys.



Since there are so many shared_ptrs, the performance must suck. Are there any suggestions on that front?

```
#include
#include
#include
#include
#include

int MAXKEYS = 5; // we split at max keys.
struct Node : std::enable_shared_from_this
{
Node(bool isLeaf = false)
: _isLeaf(isLeaf),
_parent(nullptr)
{
}

void splitNode(std::shared_ptr& siblingNode /outParam/, int& median /outParam/)
{
size_t medianOffs

Solution

I'm just going to look at the code, not the correctness of the algorithm.

On the plus side:

  • You have used a consistent system for labelling functions and variables.



  • Your code is clean and well laid out, its very easy to read.



  • You have been consistent with your use of braces.



And now the negative:

  • The comments need improvements, saying when a function is called is


OK, but why don't you say what it does and why too.

  • Why are you using structs instead of classes.



  • MAXKEYS - This looks like a constant, but it isn't coded as such.



  • Its a signed value being assigned to an unsigned value.



  • Its an odd number being divided by 2 and then integer rounded.



  • Why divide by two when you could bitshift?



  • If parameters to functions are not being modified then the intention of the author is clearer when you make them const.



  • If functions don't modify class data then make them const.



  • medianOffset should be const. It makes it clearer that you don't intend it to change.



  • Use typedef to define the structure that holds the data (std::shared_ptr<>) that way you can change it by just editing one place.



  • You need to program more defensively, you need to check the parameters to public functions are acceptable before you start processing.



  • Have you considered what happens if '_children' is empty when you call '_children.begin()' in splitNode.



If there is anything I've not explained enough please shout. Hope that helps.

Context

StackExchange Code Review Q#106210, answer score: 3

Revisions (0)

No revisions yet.