HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppModerate

MergeSort performance

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
performancemergesortstackoverflow

Problem

Do you see any inefficient parts with the following code? On internet I see 30 ms for 100.000 integers but I can sort 100.000 integers in 300ms. The data is random. I am using late 2013 macbook pro so I don't expect a 10x slowdown from CPU, but who knows.

void mergesort2(int* list, int begin, int end,int *tmplist=0){
    bool allocated = false;
    if (end-begin  middle) {

            memcpy(tmplist+tmp, list+second, sizeof(int)*(end-begin-tmp+1));
        }
        else if (second > end) {
            memcpy(tmplist+tmp, list+first, sizeof(int)*(end-begin-tmp+1));

        }
        else if (list[first] <= list[second]) {
            tmplist[tmp] = list[first++];
        }else{
            tmplist[tmp] = list[second++];
        }
    }

    memcpy(list+begin, tmplist, (end-begin+1)*sizeof(int));

    if (allocated) {
        delete tmplist;
    }
}

Solution

A C++ merge sort in the style of a Standard Library algorithm could be written as

template::value_type>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare())
{
        auto const N = std::distance(first, last);
        if (N < 2) return;
        auto middle = std::next(first, N / 2);
        merge_sort(first, middle, cmp);
        merge_sort(middle, last, cmp);
        std::inplace_merge(first, middle, last, cmp);
}


Let's look at how this improves upon your design:

  • generic interface: it takes two iterators and a comparison function of arbitrary types, instead of two indices into a pointer to an array of int and the implicit but fixed



  • in-place: behind the covers it calls std::inplace_merge that can allocate extra memory but it does it so that users don't have to supply this, instead of your tmplist parameter.



  • half-open intervals: it will split the input range [first, last) into [first, middle) and [middle, last) which automically takes care of easy to make "off-by-one" errors in code like middle + 1 (in your code it is correct, but you still have to think about it, half-open intervals are easier to get right).



  • named algorithms: it uses std::inplace_merge as a ready-to-use component instead of a somewhat long and certainly tricky final step in your algorithm. Not only makes this your merge_sort more compact, the inplace_merge itself is a resuable component in its own right!



  • heavily optimized: Standard Library experts are very likely to know about the state of the art in writing performance-optimized algorithms. A factor of 10 of performance compared to your code seems a lot, but if you forget some hidden copies, use a hand-written loop that has O(N^2) complexity instead of O(N log N)` you can quickly get there.

Code Snippets

template<class BiDirIt, class Compare = std::less<typename std::iterator_traits<BiDirIt>::value_type>>
void merge_sort(BiDirIt first, BiDirIt last, Compare cmp = Compare())
{
        auto const N = std::distance(first, last);
        if (N < 2) return;
        auto middle = std::next(first, N / 2);
        merge_sort(first, middle, cmp);
        merge_sort(middle, last, cmp);
        std::inplace_merge(first, middle, last, cmp);
}

Context

StackExchange Code Review Q#43138, answer score: 14

Revisions (0)

No revisions yet.