HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcppMinor

Modern, C++ compliant, merge sort

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
compliantsortmodernmerge

Problem

This is a kind of follow up to my previous sort implementation review with specific questions to merge sort, and Modern C++ idioms.

template::value_type >>
void merge(InputIt1 first1,
           InputIt1 last1,
           InputIt2 first2,
           InputIt2 last2,
           OutputIt out,
           Comparator cmp = Comparator())
{
    while (first1 != last1 || first2 != last2)
    {
        if (first2 == last2 || first1 != last1 && cmp(*first1, *first2))
        {
            *out = *first1++;
        }
        else
        {
            *out = *first2++;
        }
        ++out;
    }
}

template::value_type >>
void sort_merge(BidirIt first,
                BidirIt last,
                Comparator cmp = Comparator())
{
    typedef std::iterator_traits::difference_type dt;
    typedef std::iterator_traits::value_type vt;

    dt size = std::distance(first, last);
    if (size > dt(1))
    {
        auto middle = first + size / dt(2);

        std::vector left(first, middle);
        std::vector right(middle, last);

        auto left_begin = left.begin(), left_end = left.end();
        auto right_begin = right.begin(), right_end = right.end();

        sort_merge(left_begin, left_end);
        sort_merge(right_begin, right_end);

        ::merge(left_begin, left_end, right_begin, right_end, first);
    }
}


I am looking for any performance optimizations I may have missed in the algorithm itself, as well as Modern C++ constructs I may not be abiding by for this example.

I looked into in-place merging, however, all of the solutions I have found for in-place merging perform worse. Namely, substituting a sort_insertion for the merge operation. However, if I were to make an in-place sort_merge, I would start a separate topic as it's considered a completely separate algorithm.

Solution

Prefer to use move rather than copy semantics

If your object is an int then it makes no difference. But some objects are more expensive to copy (such as std::string). If you can move rather than copy you will get much better performance with more complex objects.

*out = *first1++;

// Just add a call to std::move
*out = std::move(*first1++);


Too many lines of parameters

This is bit difficult to say this is required or perfect and you can make arguments both ways. But one parameter per line seems a bit over verbose. I prefer to group my parameters into logical groups (tools don't like this as they can't apply logical as it is situation specific).

But I personally think that for readability it makes sense to put parameters in logical groups.

(see below)

Member types names

Technically this is illegal:

typedef std::iterator_traits::difference_type dt;


The compiler can not tell if difference_type is a type or a member variable until it knows the type of BidirIt. So you need to tell it that it is a typename.

typedef typename std::iterator_traits::difference_type dt;
 //     ^^^^^^^^  required for standards conformance


Type names

Types are important in C++. Try and use names that help identify the type (be a bit verbose). But also it is a sort of convention that user defined types have an initial upper case letter (so that they can be easily spotted from objects).

typedef std::iterator_traits::difference_type dt;

// Here I did not like the short name:
typedef typename std::iterator_traits::difference_type DiffType;


Don't pass comparator to merge by value

The merge operation is not supposed to be done in isolation. It is always called from the sort_merge() function. So there is no need to create a comparator for each call to merge() just create it once in sort_merge() and pass by const reference to all calls to merge():

void merge(InputIt1 first1, InputIt1 last1,
           InputIt2 first2, InputIt2 last2,
           OutputIt out,
           Comparator const& cmp)


Non optimal implementation

while (first1 != last1 || first2 != last2)
{
    if (first2 == last2 || first1 != last1 && cmp(*first1, *first2))
    {
        *out = *first1++;
    }
    else
    {
        *out = *first2++;
    }
    ++out;
}


Better to write as this:

// Test your constraints here.
while (first1 != last1 && first2 != last2)
{
    // The comparison test then becomes much simpler.
    *out = std::move(cmp(*first1, *first2)
                         ? *first1++
                         : *first2++);
    ++out;
}
// Then the final optimized copy
// Only one of these loops will execute.
std::move(first1, last1, out);
std::move(first2, last2, out);


Space optimization

Creating the copy in sort_merge()

std::vector left(first, middle);
    std::vector right(middle, last);


Means that you use more memory because you have a copy in this iteration then make a copy in each recursive call below this one and they are all active at the same time.

If you do the recursive sort_merge() in place. Then in the merge() call perform the copy then you get a much better space usage (as you only have one copy at the current iteration active).

Result

You will notice I have slightly altered the merge. We can make several assumptions about the inputs (because it is only called from sort_merge()):

  • There is only one type of input iterator.



  • Because we know we are sorting two halves of the same container.



  • There is no output iterator



  • as the merge is done in place.



  • The comparator does not need a default type.



  • this is done as part of the sort_merge template.



  • I don't pass two ranges.



  • I use begin/mid/end



Code: merge

template
void merge(I first, I mid, I last, Comparator const& cmp)
{
    typedef typename std::iterator_traits::difference_type DiffType;
    typedef typename std::iterator_traits::value_type      ValueType;

    DiffType                size1 = std::distance(first, mid);
    DiffType                size2 = std::distance(mid, last);

    std::vector  hold1;
    std::vector  hold2;
    hold1.reserve(size1);
    hold2.reserve(size2);

    std::move(first, mid,  std::back_inserter(hold1));
    std::move(mid,   last, std::back_inserter(hold2));

    auto first1 = std::begin(hold1);
    auto last1  = std::end(hold1);
    auto first2 = std::begin(hold2);
    auto last2  = std::end(hold2);
    auto out    = first;

    while (first1 != last1 && first2 != last2)
    {
        *out = std::move(cmp(*first1, *first2)
                             ? *first1++
                             : *first2++);
        ++out;
    }
    std::move(first1, last1, out);
    std::move(first2, last2, out);
}


Code: sort_merge

```
template::value_type>>
void sort_merge(I first,
I last,
Comparator cmp = Comparator())
{
typedef typename std::iterator_traits::difference_type DiffType;

DiffType size = std::

Code Snippets

*out = *first1++;

// Just add a call to std::move
*out = std::move(*first1++);
typedef std::iterator_traits<BidirIt>::difference_type dt;
typedef typename std::iterator_traits<BidirIt>::difference_type dt;
 //     ^^^^^^^^  required for standards conformance
typedef std::iterator_traits<BidirIt>::difference_type dt;

// Here I did not like the short name:
typedef typename std::iterator_traits<BidirIt>::difference_type DiffType;
void merge(InputIt1 first1, InputIt1 last1,
           InputIt2 first2, InputIt2 last2,
           OutputIt out,
           Comparator const& cmp)

Context

StackExchange Code Review Q#88298, answer score: 7

Revisions (0)

No revisions yet.