HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcppModerate

Insertion Sort in C++

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
insertionsortstackoverflow

Problem

I have created an Insertion Sort code that sorts numbers (obviously).

#include 
#include 

void insertion_sort(int arr[], int length);
void print_array(int array[],int size);

int main()
{
    int array[] = {4,6,3,7,5,9,2,8,1,10};
    insertion_sort(array,10);
    print_array(array,10);
    return 0;
}

void insertion_sort(int arr[], int length)
{
    int i,j,tmp;
    for (i = 1; i  0 && arr[j - 1] > arr[j]) {
            tmp = arr[j];
            arr[j] = arr[j - 1];
            arr[j - 1] = tmp;
            j--;
        }
    }
}

void print_array(int array[], int size)
{
    int j;
    for (j = 0; j < size; j++)
    {
        std::cout << array[j] << std::endl;
    }
}


Is there a way to make it shorter or better?

Solution

This is not C++, yet. Just using std::cout does not C++ make. But do not fret, we'll get there.

Safe interface

Whenever designing an API (or interface), you should strive to make it as safe as possible.

Your interface here is not safe: having two un-linked variables to correctly pass is asking for troubles; it is just too easy to mix the variables up and pass a size that does not correspond to the array to be sorted (or printed).

Obviously, in your toy example it does not show up, however as soon as you manipulate multiple arrays in the same function you'll be very likely to make this mistake. And even with a single array, passing the length before re-allocating (for example) is also likely.

So, you need to aim for an interface that conveys both pieces of information in a single argument. An obvious choice is std::vector, another would be some instance of array_view or a random_access_range but those are not Standard so let's not bother for now.

Thus, the interface becomes:

void insertion_sort(std::vector& array);
void print(std::vector const& array);


Do note the consistency in the naming; both times the argument is named array instead of being once arr and once array. Consistency may be the hobgobelin of little minds, but it does help readers.

Algorithmic

An insertion sort is generally described simply as: read each element in turn, and insert it in the right position among the previous read (and sorted) elements.

The cost of the algorithm is thus:

  • for each of the N elements



  • the cost of inserting it in the i-1 (where i is the index of the element being read) first elements



The outer loop we cannot do anything about, but what of the insertion?

In your code, the insertion is done by:

while (j > 0 && arr[j - 1] > arr[j]) {
        tmp = arr[j];
        arr[j] = arr[j - 1];
        arr[j - 1] = tmp;
        j--;
    }


That is, the value is bubbling up toward the front until it finds its spot.

This is suboptimal in terms of comparisons: the optimal algorithm to find the position to insert is using a binary search rather than a linear search, for O(log N) complexity rather than O(N) one.

In terms of number of moves, well, you may have to move all elements, so it could only be improved by a constant factor anyway.

Standing on the shoulders of giants, we can improve the code by separating the search and the move, and use pre-existing algorithms to do so:

  • std::upper_bound yields a pointer to the first element in the range that is greater than the value we search with



  • std::rotate rotates the elements of the range (and middle point) passed to it, such that the range delimited by the last two elements come to precede the range delimited by the first two



We can use them keeping to indices, although it is slightly verbose:

void insertion_sort(std::vector& array) {
    typedef std::vector::iterator It;

    for (size_t i = 1, length = array.size(); i < length; ++i) {
        // 1. Search
        It const end_sorted = array.begin() + i;
        It const insertion_point =
            std::upper_bound(array.begin(), end_sorted, array[i]);

        // 2. Insert
        std::rotate(insertion_point, end_sorted, end_sorted + 1);
    }
}


Thus I advise jumping the gun and just switch to iterators since that it what the interfaces of the algorithms we use consume:

void insertion_sort(std::vector& array) {
    typedef std::vector::iterator It;

    for (It it = array.begin(), end = array.end(); it != end; ++it) {
        // 1. Search
        It const insertion_point =
            std::upper_bound(array.begin(), it, *it);

        // 2. Insert
        std::rotate(insertion_point, it, it + 1);
    }
}


Which in modern C++ would use auto rather than explicitly naming the iterator type:

void insertion_sort(std::vector& array) {
    for (auto it = array.begin(), end = array.end(); it != end; ++it) {
        // 1. Search
        auto const insertion_point =
            std::upper_bound(array.begin(), it, *it);

        // 2. Insert
        std::rotate(insertion_point, it, it + 1);
    }
}


Note that abbreviating further is not a good idea; while one-liners may give the writer a fuzzy warm feeling, they just end up being head scratchers for maintainers:

// Puzzle time!
void insertion_sort(std::vector& array) {
    for (auto it = array.begin(), end = array.end(); it != end; ++it) {
        std::rotate(std::upper_bound(array.begin(), it, *it), it, it + 1);
    }
}


What does endl do?

Finally, a trick question: what do think std::endl is and does?

std::endl is a so called stream manipulator, it is a function whose signature is:

template
std::basic_ostream& endl( std::basic_ostream& os );


So, when used on a stream, it does an operation on said stream and returns it (to allow further chaining).

Specifically, `ostream

  • the buffer was introduced for performance reasons, bypassing it is asking to go slow



So, not onl

Code Snippets

void insertion_sort(std::vector<int>& array);
void print(std::vector<int> const& array);
while (j > 0 && arr[j - 1] > arr[j]) {
        tmp = arr[j];
        arr[j] = arr[j - 1];
        arr[j - 1] = tmp;
        j--;
    }
void insertion_sort(std::vector<int>& array) {
    typedef std::vector<int>::iterator It;

    for (size_t i = 1, length = array.size(); i < length; ++i) {
        // 1. Search
        It const end_sorted = array.begin() + i;
        It const insertion_point =
            std::upper_bound(array.begin(), end_sorted, array[i]);

        // 2. Insert
        std::rotate(insertion_point, end_sorted, end_sorted + 1);
    }
}
void insertion_sort(std::vector<int>& array) {
    typedef std::vector<int>::iterator It;

    for (It it = array.begin(), end = array.end(); it != end; ++it) {
        // 1. Search
        It const insertion_point =
            std::upper_bound(array.begin(), it, *it);

        // 2. Insert
        std::rotate(insertion_point, it, it + 1);
    }
}
void insertion_sort(std::vector<int>& array) {
    for (auto it = array.begin(), end = array.end(); it != end; ++it) {
        // 1. Search
        auto const insertion_point =
            std::upper_bound(array.begin(), it, *it);

        // 2. Insert
        std::rotate(insertion_point, it, it + 1);
    }
}

Context

StackExchange Code Review Q#110793, answer score: 10

Revisions (0)

No revisions yet.