patterncppMinor
Generic Multiway Merge
Viewed 0 times
genericmultiwaymerge
Problem
Recently I had a problem (not in C++) that required merging a number of sorted lists of values. This is effectively the generalised case of merging together two sorted lists. However, instead of merging them into one single list, the requirement was instead to effectively put them into "buckets" of like values. As an example, given:
{1, 2, 3}, {2, 3, 4}, {1, 3, 5}
They should go into "buckets" of like values:
{1, 1}, {2, 2}, {3, 3, 3}, {4}, {5}
I've rewritten it in C++ for fun:
```
#include
#include
#include
#include
#include
// Given an pair of Iterators to an outer container that has a nested
// container, such as std::vector>, where the values in each
// std::deque are sorted in ascending order, will return a vector of
// Iterators where the first element is an (equal) minimum over all the containers.
// For example, given:
// std::vector> = { {1, 2, 3}, {2, 3, 4}, {1, 4, 5} };
// This would return iterators to the 1st and 3rd elements, as both have
// the equal minimum (here, 1) in the first position.
template
std::vector minimums(Iterator begin, Iterator end)
{
std::vector mins;
if(begin == end) { return mins; }
while(begin != end && begin->begin() == begin->end()) { ++begin; }
if(begin == end) { return mins; }
mins.push_back(begin);
++begin;
for(auto it = begin; it != end; ++it) {
const auto& current_min = std::begin(mins[0]);
if(*(it->begin()) begin()) == current_min) {
mins.push_back(it);
}
}
return mins;
}
template
void erase_front(std::list& l) { l.pop_front(); }
template
void erase_front(std::deque& d) { d.pop_front(); }
// Performs a destructive multiway merge.
// The begin and end Iterators should reference a container with nested
// containers, where each nested container is sorted in ascending order.
// Further, the nested containers must support erasure from the front,
// so a std::deque is suggested here for optimal speed.
// To use a container that
{1, 2, 3}, {2, 3, 4}, {1, 3, 5}
They should go into "buckets" of like values:
{1, 1}, {2, 2}, {3, 3, 3}, {4}, {5}
I've rewritten it in C++ for fun:
```
#include
#include
#include
#include
#include
// Given an pair of Iterators to an outer container that has a nested
// container, such as std::vector>, where the values in each
// std::deque are sorted in ascending order, will return a vector of
// Iterators where the first element is an (equal) minimum over all the containers.
// For example, given:
// std::vector> = { {1, 2, 3}, {2, 3, 4}, {1, 4, 5} };
// This would return iterators to the 1st and 3rd elements, as both have
// the equal minimum (here, 1) in the first position.
template
std::vector minimums(Iterator begin, Iterator end)
{
std::vector mins;
if(begin == end) { return mins; }
while(begin != end && begin->begin() == begin->end()) { ++begin; }
if(begin == end) { return mins; }
mins.push_back(begin);
++begin;
for(auto it = begin; it != end; ++it) {
const auto& current_min = std::begin(mins[0]);
if(*(it->begin()) begin()) == current_min) {
mins.push_back(it);
}
}
return mins;
}
template
void erase_front(std::list& l) { l.pop_front(); }
template
void erase_front(std::deque& d) { d.pop_front(); }
// Performs a destructive multiway merge.
// The begin and end Iterators should reference a container with nested
// containers, where each nested container is sorted in ascending order.
// Further, the nested containers must support erasure from the front,
// so a std::deque is suggested here for optimal speed.
// To use a container that
Solution
My immediate reaction would be to accept a collection of ranges of some sort (e.g., Boost Range, or even just a pair of iterators).
Since the range (or iterator) types would be template parameters, the user can pass normal iterators or move iterators as s/he sees fit, so one piece of code suffices for both copying (non-destructive) and moving (destructive) merges.
I'd expect this to give sort of a half-way point between the two current pieces of code. It would probably be about the speed of the current destructive version, but potentially use more memory--it can move objects from the original container, but doesn't delete the source items from the original container as their moved to the output, so it will consume more memory than your existing destructive merge.
Memory usage will depend on whether you're dealing with objects that store most data in the object itself, or mostly store a pointer in the object and put most of the real data on the heap. In the latter case, moving out of the object will typically just move the pointer to the data into the new object, and leave only the minimal bookkeeping information in the old object, so it won't consume a lot more than the code above that deletes each item from the source container as it's moved to the destination collection.
Since the range (or iterator) types would be template parameters, the user can pass normal iterators or move iterators as s/he sees fit, so one piece of code suffices for both copying (non-destructive) and moving (destructive) merges.
I'd expect this to give sort of a half-way point between the two current pieces of code. It would probably be about the speed of the current destructive version, but potentially use more memory--it can move objects from the original container, but doesn't delete the source items from the original container as their moved to the output, so it will consume more memory than your existing destructive merge.
Memory usage will depend on whether you're dealing with objects that store most data in the object itself, or mostly store a pointer in the object and put most of the real data on the heap. In the latter case, moving out of the object will typically just move the pointer to the data into the new object, and leave only the minimal bookkeeping information in the old object, so it won't consume a lot more than the code above that deletes each item from the source container as it's moved to the destination collection.
Context
StackExchange Code Review Q#46430, answer score: 2
Revisions (0)
No revisions yet.