HiveBrain v1.2.0
Get Started
← Back to all entries
snippetjavaMinor

Heap selection sort in Java

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
javasortselectionheap

Problem

See the next iteration.

I designed and implemented this adaptive heap sort in Java. It works as follows:

  • Copy the range to be sorted into auxiliary array \$A\$



  • Create an empty run heap \$H\$



  • Scan \$A\$ from beginning to the end and whenever we have counted a longest possible run (a run is a longest ascending or descending contiguous subsequence of the input sequence), we store it in \$H\$.


Each run \$r\$ is encoded by two integers: \$r.from\$ is the index of the first array component in the run, and \$r.to\$ is the index of the last array component in the run. In the run heap, the top element is the run \$r\$, for which \$A[r.from]\$ is minimal (notice the indirection).

  • Then, as long as the run heap is not empty, we remove the top element. This is done by returning \$A[r.from]\$, where \$r\$ is the topmost run in the run heap. After that, we set \$r.from \leftarrow r.from + 1\$, and possibly sift it down in order to restore the indirect heap invariant. If a run is exhausted, it is removed from the heap.



The best case complexity of this sort is \$\Theta(N)\$. The worst case complexity is \$\Theta(N \log N)\$. Linear space complexity and is unstable (may reorder the equal elements) due to heap structure.

HeapSelectionSort.java:

```
package net.coderodde.util.sorting;

import java.util.Arrays;
import java.util.Comparator;

/**
* This class implements a sorting algorithm called
* heap selection sort. The worst case complexity is linearithmic
* O(n log n), best case complexity linear O(n). Linear space complexity and is
* unstable.
*
* @author Rodion "rodde" Efremov
* @version 1.6
*/
public class HeapSelectionSort {

/**
* Sorts the range {@code array[fromIndex], array[fromIndex + 1], ...,
* array[toIndex - 2], array[toIndex - 1]} using the specified comparator.
*
* @param the array component type.
* @param array the array holding the range to sort.
* @param fromIndex the starting (inclusive) in

Solution

This algorithm is new to me (which doesn't mean much) and it looks interesting. But you need do some comparisons in order to get famous (:D). How is the performance when compared to the stable Collections.sort? I can see that your Demo measures it, but for us, lazy people, you should present the results.

HeapSelectionSort

createHeap

The first confusing bit was for me

int left = 0;
   int right = 1;


I guess, left + 1 == right is an invariant and eliminating one of them could make it more readable. This should be easily possible.


private static class Heap {

I'd call it RunHeap as Heap has a well-known meaning.

siftDown

int rightChildIndex = 2;
       ...
           rightChildIndex = leftChildIndex + 1;


I'd define it inside the loop like

int rightChildIndex = leftChildIndex + 1;


and save one line.


array[runs[leftChildIndex].from]

This gets repeated a lot and could use a method like

T getFromObject(int index) {
    return array[runs[index].from]
}


Demo

You should repeat your measurements a few times in order to see how well it runs when fully compiled.

You could also create a Comparator counting how many times it's been called. This is hacky, but could give you valuable information about how you're doing.

long seed = System.currentTimeMillis();
   Random random = new Random(seed);


I can see, you're doing it since you want the seed, but I'd use the better build-in randomization like

final Random random = new Random();
    final long seed = random.nextLong();
    random.setSeed(seed);


Here, it doesn't matter but in general using the millisecond for seeding is far from optimum.

System.out.println("Array contents same: " + arraysEqual(array1, array2));


I hope, they're the same. So don't print it, verify it

if (!arraysEqual(array1, array2)) {
     throw new RuntimeException("Array contents NOT same");
 }


and save yourself something to check manually.

Analysis

I haven't tried it out, but I can see that you're allocation a lot of memory. A copy of the whole to be sorted range and an additional big array for the heap.

In the best case (sorted array), it obviously wastes time, but I'd agree that this case is not worth optimizing. But cases of nearly-sorted arrays may be rather common and you could try to optimize for them. You could use ArrayList instead of the array (this is obviously not free).

Another thing is the creation of many tiny Runs, which cost a lot of memory and lead to bad memory locality. I've got two ideas:

Decreasing the number of runs

If you encounter two consecutive very short runs, merge them together. Depending on how exactly you do it, you may ensure that there are no more than array.length / 4 + 3 runs or alike.

Eliminating the class

This is very low level, but int + int = long. Put from into the lower part of the long and to into the upper part. Hardly anything will change, your sifting will stay basically the same and you'll save maybe 20 bytes per run and improve your memory locality.

Measurements

This is the output when run 10 times in row:

```
Seed: 6578457428755244339
--- Random integer array ---
java.util.Arrays.sort in 359 ms.
Heap selection sort in 771 ms.
--- Presorted array ---
java.util.Arrays.sort in 57 ms. Sorted: true
Heap selection sort in 91 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 142 ms.
Heap selection sort in 675 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 127 ms.
Heap selection sort in 681 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 67 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 128 ms.
Heap selection sort in 688 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 65 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 130 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 86 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 125 ms.
Heap selection sort in 716 ms.
--- Presorted array ---
java.util.Arrays.sort in 54 ms. Sorted: true
Heap selection sort in 70 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 129 ms.
Heap selection sort in 717 ms.
--- Presorted array ---
java.util.Arrays.sort in 56 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 132 ms.
Heap selection sort in 708 ms.
--- Presorted array ---
java.util.Arrays.sort in 60 ms. Sorted: true
Heap selection sort in 64 ms. Sorted: true
--- Random integer array ---
java.util.Arrays.sort in 123 ms.
Heap selection sort in 698 ms.
--- Presorted array ---
java.util.Arrays.sort in 53 ms. Sorted: true
Heap selection sort in 6

Code Snippets

int left = 0;
   int right = 1;
int rightChildIndex = 2;
       ...
           rightChildIndex = leftChildIndex + 1;
int rightChildIndex = leftChildIndex + 1;
T getFromObject(int index) {
    return array[runs[index].from]
}
long seed = System.currentTimeMillis();
   Random random = new Random(seed);

Context

StackExchange Code Review Q#97958, answer score: 3

Revisions (0)

No revisions yet.