HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcMinor

Implementing a Generic Quick Sort With Various Optimizations

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
genericwithquicksortimplementingvariousoptimizations

Problem

I decided for educational purpose to make a generic sort like the one in the standard library as an exercise for my self in learning more about how to do generic programming in C and so I would really appreciate it if you guys could tell me how I could have done a better job and what other various improvements I could add to make my code better. I have quite a bit of functions to go through so I'm going to explain what each function does as much as I need to.

Main

int main()
{
    /*
    Throughout this source code, if you see an unsigned int, more than likely 
    I'm using an unsigned int because I'm comparing to a variable of type size_t
    */
    unsigned int i;
    int a[] = {200, 1, 99, 23, 56, 207, 369, 136, 147, 21, 4, 75, 36, 12};

    size_t numberOfElements = sizeof(a)/sizeof(a[0]);

    /*
     No point in performing quick sort if it does not have more than 1 elements and if 
     the array is sorted.
    */
    if(numberOfElements > 1 && !isSorted(a, numberOfElements, sizeof(int), cmp))
    {
        quickSort_h(a, numberOfElements, sizeof(int), cmp);
    }

    for(i = 0; i < numberOfElements; i++)
    {
        printf("%d ", a[i]);
    }

    return 0;
}


Quick Sort Helper

void quickSort_h(void *base, size_t nitems, size_t memSize, int (*cmp)(const void *, const void *))
{
    quickSort(base, 0, nitems - 1, memSize, cmp);
    insertionSort(base, nitems, memSize, cmp);
}


Quick Sort

```
/*
Here we are doing a tail recurse optimization and using our threshold value(20) to know
when to move to Insertion Sort.
*/
void quickSort(void base, int first, int last, size_t memSize, int (cmp)(const void , const void ))
{
while(last - first > THRESHOLD)
{
int pivot;

pivot = qSortPartition(base, first, last, memSize, cmp);

/*
Here we are checking which is easier to recurse by checking if the subarray
on the left is shorter than the one on the right and vice versa.
*/

Solution

Don't guess at compatibility

/*
    Throughout this source code, if you see an unsigned int, more than likely 
    I'm using an unsigned int because I'm comparing to a variable of type size_t
    */
    unsigned int i;


Why not just use size_t?

size_t i;


As stands, this has the worst of both worlds. You are using unsigned int because it usually matches size_t. But it doesn't always match size_t. And it doesn't allow you to set i to negative values. So you may have to pay the conversion penalty if it doesn't match. And you have to accept the limitations of an unsigned type regardless. If you use size_t instead, you have to accept the limitations but at least you are guaranteed not to have to do a conversion.

Know your bounds

for(i = 0; i  0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)


Consider

for (i = 1; i  0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)


That saves one iteration of the outer loop without changing behavior. This is because the inner loop won't run when i and j are 0.

Don't subtract and multiply when you can just do one

char *carray = (char *)base;
    unsigned int i;
    int j;

    for(i = 0; i  0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)
        {
            byteSwap(&carray[j * memSize], &carray[(j - 1) * memSize], memSize);
            j--;
        }
    }


Consider

char *start = base;
    char *end = start + nitems * memSize;
    char *top;

    for (top = start + memSize; top  start && cmp(previous, current) > 0)
        {
            byteSwap(previous, current, memSize);

            previous = current;
            current -= memSize;
        }
    }


This version only does one explicit multiplication in the entire thing. Rather than working with indexes that need to be converted into pointers, this manipulates the pointers directly.

Note that &start[nitems memSize] is equivalent to start + nitems memSize. That will be true even for arrays of things that are larger than one byte per element. C compilers are smart enough that when they add an integer to a pointer, they automatically multiply by the width of the element. You wouldn't need memSize at all if casting to void * hadn't discarded the type information.

Also note that cmp and byteSwap are guaranteed to work with the same values now. The original version did the same operations twice each. Doing the same thing in parallel is risky. It allows for edits that only change something in one place that needs to be changed in two.

Another side effect of this is that we only calculate one value per iteration. The original version did at least two and possibly four absent compiler optimizations.

Declarations

Even in older C versions, you aren't limited to declaring variables at the start of a function. You are limited to the start of a block.

Similarly,

char tmp;
    char *aa = a, *bb = b;

    do
    {
        tmp = *aa;


could be

char *aa = a;
    char *bb = b;

    do
    {
        char tmp = *aa;


Note that this also changes the declarations so that all assignments are alone. It's easy to miss a declaration or assignment if more than one is done per line. A rule limiting to one declaration per line with an assignment increases readability.

Code Snippets

/*
    Throughout this source code, if you see an unsigned int, more than likely 
    I'm using an unsigned int because I'm comparing to a variable of type size_t
    */
    unsigned int i;
for(i = 0; i < nitems; i++)
    {
        j = i;
        while(j > 0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)
for (i = 1; i < nitems; i++)
    {
        j = i;
        while(j > 0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)
char *carray = (char *)base;
    unsigned int i;
    int j;

    for(i = 0; i < nitems; i++)
    {
        j = i;
        while(j > 0 && cmp(&carray[j * memSize], &carray[(j - 1) * memSize]) < 0)
        {
            byteSwap(&carray[j * memSize], &carray[(j - 1) * memSize], memSize);
            j--;
        }
    }
char *start = base;
    char *end = start + nitems * memSize;
    char *top;

    for (top = start + memSize; top < end; top += memSize)
    {
        char *previous = top;
        char *current = previous - memSize;

        while (previous > start && cmp(previous, current) > 0)
        {
            byteSwap(previous, current, memSize);

            previous = current;
            current -= memSize;
        }
    }

Context

StackExchange Code Review Q#140637, answer score: 2

Revisions (0)

No revisions yet.