HiveBrain v1.2.0
Get Started
← Back to all entries
snippetjavaMinor

Handling duplicate keys in quicksort

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
handlingduplicatekeysquicksort

Problem

A naïve quicksort will take O(n^2) time to sort an array containing no unique keys, because all keys will be partitioned either before or after the pivot value. There are ways to handle duplicate keys (like one described in Quicksort is Optimal). The proposed solution only works for the Hoare partition, but I've implemented the Lomuto partition. To deal with duplicate keys, I alternated between moving duplicates to the left of the pivot and moving duplicates to the right of the pivot. Here's my quicksort (ignore the lack of generics):

public static void swap(Comparable[] sort, int a, int b){
    Comparable temp=sort[a];
    sort[a]=sort[b];
    sort[b]=temp;
}
public static void quicksort(Comparable[] sort, int start, int end){
    while(end-start>1){//normal case
        int pivot=gen.nextInt(end-start)+start;//random pivot
        swap(sort, pivot, start);
        int index=start;//walking index
        boolean dupHandler=false;//init to true works also
        for(int i=start+1; i0 || ( val==0 && (dupHandler=!dupHandler) ) )
                swap(sort, ++index, i);
        }
        swap(sort, start, index);
        if(index-start<end-index-1){//recurse into smaller partition
            quicksort(sort, start, index);
            start=index+1;//use iteration for other partition
        }
        else{
            quicksort(sort, index+1, end);
            end=index;
        }
    }
}


Is there a better (more efficient) way to handle duplicate keys? If not, is there any way to significantly improve my code?

Solution


  • Use a swap function rather then repeating the three lines required to swap elements three times



  • Put spaces around your operators



  • You avoid recursing for the second partition. I don't think the performance gain from this is sufficient for the extra complexity in your code.



Rather then swapping before/after for the element equal to the pivot, count the number of elements equal to the pivot. Then when you sort the subarray make the one subarray shorter so as not to sort the elements equal to the pivot.

Another issue as Rafe has noted is that there are other issues besides duplicate keys that can cause a O(n^2) behavior. Your code doesn't handle any of those.

Context

StackExchange Code Review Q#3889, answer score: 4

Revisions (0)

No revisions yet.