snippetjavaMinor
Heap selection sort in Java
Viewed 0 times
javasortselectionheap
Problem
See the next iteration.
I designed and implemented this adaptive heap sort in Java. It works as follows:
Each run \$r\$ is encoded by two integers: \$r.from\$ is the index of the first array component in the run, and \$r.to\$ is the index of the last array component in the run. In the run heap, the top element is the run \$r\$, for which \$A[r.from]\$ is minimal (notice the indirection).
The best case complexity of this sort is \$\Theta(N)\$. The worst case complexity is \$\Theta(N \log N)\$. Linear space complexity and is unstable (may reorder the equal elements) due to heap structure.
HeapSelectionSort.java:
```
package net.coderodde.util.sorting;
import java.util.Arrays;
import java.util.Comparator;
/**
* This class implements a sorting algorithm called
* heap selection sort. The worst case complexity is linearithmic
* O(n log n), best case complexity linear O(n). Linear space complexity and is
* unstable.
*
* @author Rodion "rodde" Efremov
* @version 1.6
*/
public class HeapSelectionSort {
/**
* Sorts the range {@code array[fromIndex], array[fromIndex + 1], ...,
* array[toIndex - 2], array[toIndex - 1]} using the specified comparator.
*
* @param the array component type.
* @param array the array holding the range to sort.
* @param fromIndex the starting (inclusive) in
I designed and implemented this adaptive heap sort in Java. It works as follows:
- Copy the range to be sorted into auxiliary array \$A\$
- Create an empty run heap \$H\$
- Scan \$A\$ from beginning to the end and whenever we have counted a longest possible run (a run is a longest ascending or descending contiguous subsequence of the input sequence), we store it in \$H\$.
Each run \$r\$ is encoded by two integers: \$r.from\$ is the index of the first array component in the run, and \$r.to\$ is the index of the last array component in the run. In the run heap, the top element is the run \$r\$, for which \$A[r.from]\$ is minimal (notice the indirection).
- Then, as long as the run heap is not empty, we remove the top element. This is done by returning \$A[r.from]\$, where \$r\$ is the topmost run in the run heap. After that, we set \$r.from \leftarrow r.from + 1\$, and possibly sift it down in order to restore the indirect heap invariant. If a run is exhausted, it is removed from the heap.
The best case complexity of this sort is \$\Theta(N)\$. The worst case complexity is \$\Theta(N \log N)\$. Linear space complexity and is unstable (may reorder the equal elements) due to heap structure.
HeapSelectionSort.java:
```
package net.coderodde.util.sorting;
import java.util.Arrays;
import java.util.Comparator;
/**
* This class implements a sorting algorithm called
* heap selection sort. The worst case complexity is linearithmic
* O(n log n), best case complexity linear O(n). Linear space complexity and is
* unstable.
*
* @author Rodion "rodde" Efremov
* @version 1.6
*/
public class HeapSelectionSort {
/**
* Sorts the range {@code array[fromIndex], array[fromIndex + 1], ...,
* array[toIndex - 2], array[toIndex - 1]} using the specified comparator.
*
* @param the array component type.
* @param array the array holding the range to sort.
* @param fromIndex the starting (inclusive) in
Solution
This algorithm is new to me (which doesn't mean much) and it looks interesting. But you need do some comparisons in order to get famous (:D). How is the performance when compared to the stable
HeapSelectionSort
createHeap
The first confusing bit was for me
I guess,
private static class Heap {
I'd call it
siftDown
I'd define it inside the loop like
and save one line.
array[runs[leftChildIndex].from]
This gets repeated a lot and could use a method like
Demo
You should repeat your measurements a few times in order to see how well it runs when fully compiled.
You could also create a
I can see, you're doing it since you want the
Here, it doesn't matter but in general using the millisecond for seeding is far from optimum.
I hope, they're the same. So don't print it, verify it
and save yourself something to check manually.
Analysis
I haven't tried it out, but I can see that you're allocation a lot of memory. A copy of the whole to be sorted range and an additional big array for the heap.
In the best case (sorted array), it obviously wastes time, but I'd agree that this case is not worth optimizing. But cases of nearly-sorted arrays may be rather common and you could try to optimize for them. You could use
Another thing is the creation of many tiny
Decreasing the number of runs
If you encounter two consecutive very short runs, merge them together. Depending on how exactly you do it, you may ensure that there are no more than
Eliminating the class
This is very low level, but
Measurements
This is the output when run 10 times in row:
```
Seed: 6578457428755244339
--- Random integer array ---
java.util.Arrays.sort in 359 ms.
Heap selection sort in 771 ms.
--- Presorted array ---
java.util.Arrays.sort in 57 ms. Sorted: true
Heap selection sort in 91 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 142 ms.
Heap selection sort in 675 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 127 ms.
Heap selection sort in 681 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 67 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 128 ms.
Heap selection sort in 688 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 130 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 86 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 125 ms.
Heap selection sort in 716 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 70 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 129 ms.
Heap selection sort in 717 ms.
--- Presorted array ---
java.util.Arrays.sort in 56 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 132 ms.
Heap selection sort in 708 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 123 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 53 ms. Sorted: true
Heap selection sort in 6
Collections.sort? I can see that your Demo measures it, but for us, lazy people, you should present the results.HeapSelectionSort
createHeap
The first confusing bit was for me
int left = 0;
int right = 1;I guess,
left + 1 == right is an invariant and eliminating one of them could make it more readable. This should be easily possible.private static class Heap {
I'd call it
RunHeap as Heap has a well-known meaning.siftDown
int rightChildIndex = 2;
...
rightChildIndex = leftChildIndex + 1;I'd define it inside the loop like
int rightChildIndex = leftChildIndex + 1;and save one line.
array[runs[leftChildIndex].from]
This gets repeated a lot and could use a method like
T getFromObject(int index) {
return array[runs[index].from]
}Demo
You should repeat your measurements a few times in order to see how well it runs when fully compiled.
You could also create a
Comparator counting how many times it's been called. This is hacky, but could give you valuable information about how you're doing.long seed = System.currentTimeMillis();
Random random = new Random(seed);I can see, you're doing it since you want the
seed, but I'd use the better build-in randomization likefinal Random random = new Random();
final long seed = random.nextLong();
random.setSeed(seed);Here, it doesn't matter but in general using the millisecond for seeding is far from optimum.
System.out.println("Array contents same: " + arraysEqual(array1, array2));I hope, they're the same. So don't print it, verify it
if (!arraysEqual(array1, array2)) {
throw new RuntimeException("Array contents NOT same");
}and save yourself something to check manually.
Analysis
I haven't tried it out, but I can see that you're allocation a lot of memory. A copy of the whole to be sorted range and an additional big array for the heap.
In the best case (sorted array), it obviously wastes time, but I'd agree that this case is not worth optimizing. But cases of nearly-sorted arrays may be rather common and you could try to optimize for them. You could use
ArrayList instead of the array (this is obviously not free).Another thing is the creation of many tiny
Runs, which cost a lot of memory and lead to bad memory locality. I've got two ideas:Decreasing the number of runs
If you encounter two consecutive very short runs, merge them together. Depending on how exactly you do it, you may ensure that there are no more than
array.length / 4 + 3 runs or alike.Eliminating the class
This is very low level, but
int + int = long. Put from into the lower part of the long and to into the upper part. Hardly anything will change, your sifting will stay basically the same and you'll save maybe 20 bytes per run and improve your memory locality.Measurements
This is the output when run 10 times in row:
```
Seed: 6578457428755244339
--- Random integer array ---
java.util.Arrays.sort in 359 ms.
Heap selection sort in 771 ms.
--- Presorted array ---
java.util.Arrays.sort in 57 ms. Sorted: true
Heap selection sort in 91 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 142 ms.
Heap selection sort in 675 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 127 ms.
Heap selection sort in 681 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 67 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 128 ms.
Heap selection sort in 688 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 130 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 86 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 125 ms.
Heap selection sort in 716 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 70 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 129 ms.
Heap selection sort in 717 ms.
--- Presorted array ---
java.util.Arrays.sort in 56 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 132 ms.
Heap selection sort in 708 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 123 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 53 ms. Sorted: true
Heap selection sort in 6
Code Snippets
int left = 0;
int right = 1;int rightChildIndex = 2;
...
rightChildIndex = leftChildIndex + 1;int rightChildIndex = leftChildIndex + 1;T getFromObject(int index) {
return array[runs[index].from]
}long seed = System.currentTimeMillis();
Random random = new Random(seed);Context
StackExchange Code Review Q#97958, answer score: 3
Revisions (0)
No revisions yet.