HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

ANSI C Mergesort

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
ansimergesortstackoverflow

Problem

This is a working mergesort implementation I wrote as I'm trying to get back into C.

I am not so much interested in feedback on the optimality of the algorithm (as I could read up countless articles I'm sure), but deep criticism of my style would be greatly appreciated.

```
#include
#include
#include

void mergesort_recurse(int* in_array, int i, int j);
void merge(int* array, int left, int centre, int right);

void mergesort(int* in_array, int len) {
mergesort_recurse(in_array, 0, len);
}

void mergesort_recurse(int* in_array, int left, int right) {

int length = right-left;

/*Base case: There are only two elements - Compare them and swap if
necessary.*/
if(length == 2) {
if(in_array[left] > in_array[right-1]) {
int temp = in_array[right-1];
in_array[right-1] = in_array[left];
in_array[left] = temp;
}
}

/Leaves of lenth 1 can be ignored as they are already "sorted"/
else if(length > 1){

/Split into two and sort recursively./
int centre = left + length / 2;

mergesort_recurse(in_array, left, centre);
mergesort_recurse(in_array, centre, right);
merge(in_array, left, centre, right);
}
}

void merge(int* array, int left, int centre, int right) {

/*Establish the initial indexes of the left and right sides of the two
"halves" to be merged.*/
int l = left;
int r = centre;
int length = right-left;

/*A temporary array to hold the merged data - this implementation is not
in-place*/
int tmp = malloc(length sizeof(int));

/Iterate over the length of the merge./
int i;
for(i = 0; i < length; i++) {

if(r==right){
/*If the right index has reached the right hand end of the data, the
remaining elements can be dumped in from the LHS.*/
memcpy(tmp+i, array+l, (length-i)*sizeof(int));
break;
}

else if (l==centre) {

Solution

I can't fault too much here, it's generally looking pretty reasonable. I think the first point below is the most important, depending on how much experience you have with C, the second may or may not make much sense.

Information Hiding

Firstly, you should split this up into separate .h and .c files; perhaps you've already done this but just pasted them together for this post. Either way, if you want to reuse this, you'll need to put your function prototypes into a header file to #include.

When you do this, you should give internal only functions (that is, functions you don't want your users to call directly) internal linkage. Hence your two functions:

void mergesort_recurse(int* in_array, int i, int j);
void merge(int* array, int left, int centre, int right);


should be prefaced with static:

static void mergesort_recurse(int* in_array, int i, int j);
static void merge(int* array, int left, int centre, int right);


So your header will look something like:

#ifndef MERGESORT_H_
#define MERGESORT_H_

void mergesort(int *in_array, int len);

#endif //MERGESORT_H_


And the static function prototypes go in the .c file:

static void mergesort_recurse(int* in_array, int i, int j);
static void merge(int* array, int left, int centre, int right);


This prevents anyone but the implementer from calling merge or mergesort_recurse. This allows you to safely modify any static functions and know that they won't be utilized in any user code - their interfaces can freely change and it won't break things - they're effectively private functions, in object-oriented terms.

Generic Sorting

You say you're trying to get back into C, and I'm not sure how well you knew the language before, but generally things like sorting routines will be generic. This can be somewhat complex, but I'll run through what you'd need to change if you wanted to allow sorting any type.

Firstly, all our int become void . We'll also need to add a parameter specifying the size of the type we're actually passing in.

void mergesort(void *in_array, int len, size_t el_size);


Next step is to abstract away how comparison is done: instead of simply using `, we'll pass in a function pointer which we can use as a comparison function: this has the signature:

int (*compare)(const void*, const void*);


That's pretty ugly though, so let's make a
typedef for it:

typedef int (*cmp_t)(const void*, const void*);


We'll also do the same thing for swapping two values: this will be used to replace

int temp = in_array[right-1];
in_array[right-1] = in_array[left];
in_array[left] = temp;


Again, with a
typedef:

typedef void (*swap_t)(void *, void *);


Our header now looks like:

#ifndef MERGESORT_H_
#define MERGESORT_H_

typedef int (*cmp_t)(const void*, const void*);
typedef void (*swap_t)(void *, void*);

void mergesort(void *in_array, int len, size_t el_size, cmp_t, swap_t);

#endif //MERGESORT_H_


And our in
.c:

static void mergesort_recurse
(void* in_array, int i, int j, size_t el_size, cmp_t, swap_t);

static void merge(void *array, int left, int centre, int right, size_t el_size, cmp_t);


The implementation can is still fairly similar, but working with
void* makes things a bit messier. Instead of if(in_array[left] > in_array[right-1]) we'll need to call our comparison function, and do a bunch of mucking around with casts:

compare(((char *)in_array + (left * el_size)), ((char *)in_array + (right - 1)*el_size)) > 0)


Hrm, yuck, that's pretty ugly. Let's make a function that can do all of this casting and index calculating for us:

In our
.c file:

static void* get_pos(void *array, int index, size_t el_size)
{
    return ((char *)array + (index * el_size));
}


The only other thing that needs to be changed in this function is passing through function pointers to the other functions:

mergesort_recurse(in_array, left, centre, el_size, compare, swap);
mergesort_recurse(in_array, centre, right, el_size, compare, swap);
merge(in_array, left, centre, right, el_size, compare);


In
merge, all the calls to memcpy will have to change a little bit, for example:

memcpy(get_pos(tmp, i, el_size), get_pos(array, r, el_size), (length-i) * el_size);


Doing all of this allows the user to sort whatever they want - integers, doubles, structs, whatever - as long as they provide their own comparison and swap functions:

int compare_i(const void *a, const void *b)
{
    int x = *((int *)a); 
    int y = *((int *)b);
    if(x == y) return 0;
    if(x < y) return -1;
    return 1;
}

void swap_i(void *a, void *b)
{
    int temp = *(int *)a;
    *((int *)a) = *((int *)b);
    *((int *)b) = temp;
}


So our
mergesort` call would look something like:

int x[] = {5, 8, 2, 3, 1, 7, 10};
mergesort(x, 7, sizeof(int), compare_i, swap_i);


If the above is all very confusing, I apologise, but hopefully it'll give you so

Code Snippets

void mergesort_recurse(int* in_array, int i, int j);
void merge(int* array, int left, int centre, int right);
static void mergesort_recurse(int* in_array, int i, int j);
static void merge(int* array, int left, int centre, int right);
#ifndef MERGESORT_H_
#define MERGESORT_H_

void mergesort(int *in_array, int len);

#endif //MERGESORT_H_
static void mergesort_recurse(int* in_array, int i, int j);
static void merge(int* array, int left, int centre, int right);
void mergesort(void *in_array, int len, size_t el_size);

Context

StackExchange Code Review Q#21036, answer score: 9

Revisions (0)

No revisions yet.