HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Binary heap with O(1) lookup and O(log n) change key

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
logwithheapbinaryandlookupchangekey

Problem

I needed a priority queue that allowed efficient searching and changing of keys. I tried std::priority_queue, but the interface was extremely limiting. Not being able to search for existence wasn't a big deal since I could separately keep a std::unordered_set around, but not being able to modify key value was a deal breaker.

I next sampled a std::vector maintained with std::make_heap and friends, but finding the element took \$\mathcal{O}(n)\$ time and changing the key value took \$3n\$ operations since I had to std::make_heap the entire sequence.

Here is my binary heap with these guarantees:

  • \$\Theta(1)\$ time search



  • \$\Theta(n)\$ time construction and batch insert



  • \$\Theta(\log n)\$ time insert



  • \$\Theta(\log n)\$ time extract top



  • \$\Theta(\log n)\$ time change key (and check key if value was indirectly changed)



```
#include
#include
#include
#include
#define SENTINEL(T) (T{})

namespace sal {

// by default maxheap where parent greater than children
template >
class Heap {
// default representation is as a vector, index at 1 so need 1 extra element
std::vector elems;
// associate each key with an index
std::unordered_map index;
// comparator for sorting in a maxheap
Cmp cmp;
size_t parent(size_t i) {return i >> 1;}
size_t left(size_t i) {return i > 1; i != 0; --i)
heapify(i);
}
// adjusts elems[i] assuming it's been modified to be smaller than its children
// runs in O(lgn) time, floats elems[i] down
void heapify(size_t i) {
while (true) {
size_t l = left(i), r = right(i);
size_t largest = i;
// largest of elems[i], elems[l], and elems[r]
if (l ::iterator;
using const_iterator = typename std::vector::const_iterator;
// default T value as sentinel

// construction ------------
// O(n) time construction via these constructors (would be O(nlgn) for repeated inserts)
Heap(Cmp& c) : elems{T{

Solution

It looks like the public API of Heap is

bool empty() const
bool size() const
T top() const
size_t key(T) const
T extract_top()
void insert(T)
template void batch_insert(Iter, Iter)
void change_key(size_t, T)
void check_key(size_t)
bool is_maxheap() const
bool is_minheap() const
bool correct_index() const


Some of these names match the STL containers very well (e.g. empty() and size()), but some of them are completely opaque to me (e.g. is_maxheap(), correct_index()). It would be nice if Heap behaved more like an STL container, wherever possible.

The existence of T top() const implies that Heap cannot be used with non-copyable Ts, and is inefficient even for copyable Ts. It would make more sense to provide a top() method along the same lines as std::queue::front():

T& top();
const T& top() const;


In the iterator class:

void operator++() {++cur;}


Of course you mean

auto& operator++() { ++cur; return *this; }
auto operator++(int) { auto result = *this; ++*this; return result; }


typename Cmp = std::greater


In C++14, it would be preferable to specify Cmp = std::greater<>. I don't know of any best-practices way of saying "use greater<> but degrade gracefully to greater in C++11."

batch_insert(Iter, Iter) sounds an awful lot like what the STL calls insert(Iter, Iter). Function overloading does suck, but inconsistency sucks more. Prefer to copy the STL's naming convention in this case.

In your version "for non-POD T": This should be your only version. The POD-ness of T isn't important; what's important is whether the type is expensive to copy or not.

TP actual_value {new T{v}};


You should be using smart pointers; i.e., this should be auto actual_value = make_unique(v). Otherwise, you risk memory leaks if any of the later lines in this function throw an exception.

Also, you should be using perfect forwarding (std::forward(v)) to eliminate unnecessary copies in the case that v is copyable, and to eliminate compiler diagnostics in the case that v is non-copyable.

Speaking of move semantics,

Heap(Cmp& c) : elems{T{}}, cmp(c) {}
Heap(initializer_list l, Cmp&& c = Cmp{}) : cmp(c) { ... }


These are both incorrect. What you meant is

Heap(Cmp c) : elems{T{}}, cmp(std::move(c)) {}
Heap(initializer_list l, Cmp c = Cmp{}) : cmp(std::move(c)) { ... }


Specifying the parameter as Cmp& means that the user can't pass in a reference to a const lvalue, nor can he pass in an rvalue expression.

Specifying the parameter as Cmp&& means that the user can't pass in any kind of lvalue (const or not), which means the user can't pass in any kind of named object.

What you wanted was simply Cmp, which means "I don't care what the user gives me, as long as it's possible to (implicitly) construct a Cmp from it." Once we've got that Cmp object on the stack, the next thing we do is obviously to std::move it into the right place.

swap(elems[1],elems.back());


This should be

using std::swap;
swap(elems[1],elems.back());


Otherwise, the code won't even compile for something like Heap, because there's no such thing as a free function swap(int, int). You have to make sure that name lookup will correctly find std::swap if ADL fails.

(If it's been working for you because you using namespace std; at file scope before including this header, then your code is bad and you should feel bad.)

Stylistically, the code is a bit dense, especially with all those confusingly-named public member functions. I didn't spend any time trying to puzzle out what correct_index() was meant to do, for example. I would recommend putting a single blank line between member function definitions; and making sure member functions are private when appropriate; and making sure that member functions' names match the STL conventions whenever possible.

Finally: Isn't the "change_key" operation just the same thing as removing the pair {key1, value1} and then inserting the new pair {key2, value1}? What's the point of providing it as a primitive operation?— can you actually do it more efficiently than "remove and re-insert"?

Code Snippets

bool empty() const
bool size() const
T top() const
size_t key(T) const
T extract_top()
void insert(T)
template<class Iter> void batch_insert(Iter, Iter)
void change_key(size_t, T)
void check_key(size_t)
bool is_maxheap() const
bool is_minheap() const
bool correct_index() const
T& top();
const T& top() const;
void operator++() {++cur;}
auto& operator++() { ++cur; return *this; }
auto operator++(int) { auto result = *this; ++*this; return result; }
typename Cmp = std::greater<T>

Context

StackExchange Code Review Q#75539, answer score: 4

Revisions (0)

No revisions yet.