patternModerate
Dual-pivot Quicksort reference implementation?
Viewed 0 times
referencepivotquicksortdualimplementation
Problem
Has some sort of canonical - or reference - implementation of Dual-pivot Quicksort been posted anywhere?
I would like to include that algorithm in a comparison among sorting algorithms for a specialized need that I have, but the Java versions I've seen appear to have various kinds of tweaks applied to them, like using Insertion Sort for small (sub-) arrays, which makes it harder to compare the fundamentals.
I would like to include that algorithm in a comparison among sorting algorithms for a specialized need that I have, but the Java versions I've seen appear to have various kinds of tweaks applied to them, like using Insertion Sort for small (sub-) arrays, which makes it harder to compare the fundamentals.
Solution
I tried to do exactly such a comparison in my master thesis, which thus naturally includes pseudo-code of “basic” versions of several dual-pivot Quicksorts (there is a list of them on page 9).
Here is my basic implementation of Yaroslavskiy's algorithm (the dual-pivot scheme that is used in Java 7):
The partitioning loop maintains the following invariant:
(Initially, the “?”-area is the whole array, at the end it is gone; the indices are moved in the direction indicated by the arrays, so $\ell$ and $k$ start at the left, $g$ at the right end of $A$.)
I did a mathematical average case analysis for these algorithms to determine the expected number of key comparisons, element swaps and also instruction counts for some specific low-level implementations (in Java Bytecode and MMIX, an RISC assembly language).
The main result is that Yaroslavskiy's algorithm uses $1.9n \ln n + O(n)$ comparison, which is 5% less than the $2n \ln n + O(n)$ comparisons of classic single-pivot Quicksort (both average case on random permutations).
However, it needs $0.6 n\ln n + O(n)$ swaps for that, where classic Quicksort only needs $\frac13 n \ln n + O(n)$ swaps;
similarly, my instruction counts also indicate that classic Quicksort is more efficient.
What exactly makes Java 7's dual-pivot algorithm faster remains an open question.
(In case you are interested in more details on that, I could elaborate further.)
Tweaks of the algorithm like switching to Insertionsort for subproblems smaller than some (constant) threshold or optimizing code outside the partitioning loop (
This means that for large lists, they are not important.
For practical input sizes, they still help a lot, so having them in library code is a must.
Here is my basic implementation of Yaroslavskiy's algorithm (the dual-pivot scheme that is used in Java 7):
void sort(int[] A, int left, int right) {
if (right > left) {
// Choose outermost elements as pivots
if (A[left] > A[right]) swap(A, left, right);
int p = A[left], q = A[right];
// Partition A according to invariant below
int l = left + 1, g = right - 1, k = l;
while (k = q) {
while (A[g] > q && k < g) --g;
swap(A, k, g);
--g;
if (A[k] < p) {
swap(A, k, l);
++l;
}
}
++k;
}
--l; ++g;
// Swap pivots to final place
swap(A, left, l); swap(A, right, g);
// Recursively sort partitions
sort(A, left, l - 1);
sort(A, l + 1, g - 1);
sort(A, g + 1, right);
}
}
void swap(int[] A, int i, int j) {
final int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
}The partitioning loop maintains the following invariant:
(Initially, the “?”-area is the whole array, at the end it is gone; the indices are moved in the direction indicated by the arrays, so $\ell$ and $k$ start at the left, $g$ at the right end of $A$.)
I did a mathematical average case analysis for these algorithms to determine the expected number of key comparisons, element swaps and also instruction counts for some specific low-level implementations (in Java Bytecode and MMIX, an RISC assembly language).
The main result is that Yaroslavskiy's algorithm uses $1.9n \ln n + O(n)$ comparison, which is 5% less than the $2n \ln n + O(n)$ comparisons of classic single-pivot Quicksort (both average case on random permutations).
However, it needs $0.6 n\ln n + O(n)$ swaps for that, where classic Quicksort only needs $\frac13 n \ln n + O(n)$ swaps;
similarly, my instruction counts also indicate that classic Quicksort is more efficient.
What exactly makes Java 7's dual-pivot algorithm faster remains an open question.
(In case you are interested in more details on that, I could elaborate further.)
Tweaks of the algorithm like switching to Insertionsort for subproblems smaller than some (constant) threshold or optimizing code outside the partitioning loop (
while(k <= g) { ... }) will only change the numbers hidden by $O(n)$. This means that for large lists, they are not important.
For practical input sizes, they still help a lot, so having them in library code is a must.
Code Snippets
void sort(int[] A, int left, int right) {
if (right > left) {
// Choose outermost elements as pivots
if (A[left] > A[right]) swap(A, left, right);
int p = A[left], q = A[right];
// Partition A according to invariant below
int l = left + 1, g = right - 1, k = l;
while (k <= g) {
if (A[k] < p) {
swap(A, k, l);
++l;
} else if (A[k] >= q) {
while (A[g] > q && k < g) --g;
swap(A, k, g);
--g;
if (A[k] < p) {
swap(A, k, l);
++l;
}
}
++k;
}
--l; ++g;
// Swap pivots to final place
swap(A, left, l); swap(A, right, g);
// Recursively sort partitions
sort(A, left, l - 1);
sort(A, l + 1, g - 1);
sort(A, g + 1, right);
}
}
void swap(int[] A, int i, int j) {
final int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
}Context
StackExchange Computer Science Q#24092, answer score: 11
Revisions (0)
No revisions yet.