HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Dual-pivot Quicksort reference implementation?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
referencepivotquicksortdualimplementation

Problem

Has some sort of canonical - or reference - implementation of Dual-pivot Quicksort been posted anywhere?

I would like to include that algorithm in a comparison among sorting algorithms for a specialized need that I have, but the Java versions I've seen appear to have various kinds of tweaks applied to them, like using Insertion Sort for small (sub-) arrays, which makes it harder to compare the fundamentals.

Solution

I tried to do exactly such a comparison in my master thesis, which thus naturally includes pseudo-code of “basic” versions of several dual-pivot Quicksorts (there is a list of them on page 9).

Here is my basic implementation of Yaroslavskiy's algorithm (the dual-pivot scheme that is used in Java 7):

void sort(int[] A, int left, int right) {
    if (right > left) {
        // Choose outermost elements as pivots
        if (A[left] > A[right]) swap(A, left, right);
        int p = A[left], q = A[right];

        // Partition A according to invariant below
        int l = left + 1, g = right - 1, k = l;
        while (k = q) {
                while (A[g] > q && k < g) --g;
                swap(A, k, g);
                --g;
                if (A[k] < p) {
                    swap(A, k, l);
                    ++l;
                }
            }
            ++k;
        }
        --l; ++g;

        // Swap pivots to final place
        swap(A, left, l); swap(A, right, g);

        // Recursively sort partitions
        sort(A, left, l - 1);
        sort(A, l + 1, g - 1);
        sort(A, g + 1, right);
    }
}

void swap(int[] A, int i, int j) {
    final int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
}


The partitioning loop maintains the following invariant:

(Initially, the “?”-area is the whole array, at the end it is gone; the indices are moved in the direction indicated by the arrays, so $\ell$ and $k$ start at the left, $g$ at the right end of $A$.)

I did a mathematical average case analysis for these algorithms to determine the expected number of key comparisons, element swaps and also instruction counts for some specific low-level implementations (in Java Bytecode and MMIX, an RISC assembly language).

The main result is that Yaroslavskiy's algorithm uses $1.9n \ln n + O(n)$ comparison, which is 5% less than the $2n \ln n + O(n)$ comparisons of classic single-pivot Quicksort (both average case on random permutations).
However, it needs $0.6 n\ln n + O(n)$ swaps for that, where classic Quicksort only needs $\frac13 n \ln n + O(n)$ swaps;
similarly, my instruction counts also indicate that classic Quicksort is more efficient.
What exactly makes Java 7's dual-pivot algorithm faster remains an open question.
(In case you are interested in more details on that, I could elaborate further.)

Tweaks of the algorithm like switching to Insertionsort for subproblems smaller than some (constant) threshold or optimizing code outside the partitioning loop (while(k <= g) { ... }) will only change the numbers hidden by $O(n)$.
This means that for large lists, they are not important.
For practical input sizes, they still help a lot, so having them in library code is a must.

Code Snippets

void sort(int[] A, int left, int right) {
    if (right > left) {
        // Choose outermost elements as pivots
        if (A[left] > A[right]) swap(A, left, right);
        int p = A[left], q = A[right];

        // Partition A according to invariant below
        int l = left + 1, g = right - 1, k = l;
        while (k <= g) {
            if (A[k] < p) {
                swap(A, k, l);
                ++l;
            } else if (A[k] >= q) {
                while (A[g] > q && k < g) --g;
                swap(A, k, g);
                --g;
                if (A[k] < p) {
                    swap(A, k, l);
                    ++l;
                }
            }
            ++k;
        }
        --l; ++g;

        // Swap pivots to final place
        swap(A, left, l); swap(A, right, g);

        // Recursively sort partitions
        sort(A, left, l - 1);
        sort(A, l + 1, g - 1);
        sort(A, g + 1, right);
    }
}

void swap(int[] A, int i, int j) {
    final int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
}

Context

StackExchange Computer Science Q#24092, answer score: 11

Revisions (0)

No revisions yet.