patternjavaMinor
Is there anything to improve in this MergeSort code in Java?
Viewed 0 times
thismergesortanythingjavaimprovecodethere
Problem
I have written an implementation of the merge sort algorithm in Java.
Here is the
The code passed some simple tests. I wonder is there anything wrong or missing out? Anywhere to improve?
Here is the
MergeSort class that I created.public class MergeSort{
public static void sort(Comparable[] a){
Comparable[] b = new Comparable[a.length];
sort(a, b, 0, a.length-1);
}
private static void sort(Comparable[] a, Comparable[] b, int lo, int hi){
if(lo>=hi) return;
int mid = lo+(hi-lo)/2;
sort(a, b, lo, mid);
sort(a, b, mid+1, hi);
merge(a, b, lo, hi, mid);
}
private static void merge(Comparable[] a, Comparable[] b, int lo, int hi, int mid){
int i=lo;
int j=mid+1;
for(int k=lo;k0) {b[k]=a[j++];continue;}
else if(a[i].compareTo(a[j])hi && imid && j<=hi) {b[k]=a[j++];continue;}
}
for(int n=lo;n<=hi;n++){
a[n]=b[n];
}
}
}The code passed some simple tests. I wonder is there anything wrong or missing out? Anywhere to improve?
Solution
Generics
In Java, it is not safe to create an array of a generic type. This is what you are doing...
Because the generic types get removed when you compile the code (see Type Erasure), there is no safe way for Java to ensure that you have the same types of data in your array.
For example, you could have specified your input as:
Then, even though it is an array of Integer, the java compiler only knows it is an array of Comparable, so it can allow you to say
This leads to ugly warnings. The right solution is to use an appropriately typed collection
An alternate solution is to specify the full type of the Comparable as part of the method definition (a Generic Method... ):
... and then follow that same generic method in to your inner methods
It is complicated. There is some reference material here:
Style
You are not using white-space to help make things readable. Consider this line of code:
This should be:
See the java Code-Style guidelines for this:
All binary operators except . should be separated from their operands by spaces. Blank spaces should never separate unary operators such as unary minus, increment ("++"), and decrement ("--") from their operands
Additionally, you have a number of complex 1-liners that should be split on to multiple lines:
That should be:
Algorithm
MidPoint
You have also copied your code from some reference texts, or otherwise done some mid-point research:
This mechanism for getting the mid point of the array is the right pattern to use. Anyone who does it this way is either copying the code, has run in to the bug, or has been well-educated. This is a good thing, but, in case you did not know why you were doing it this way, now you know.
Bug
I believe you have a bug in the code when you have two values the same, one in each part of the merge....
... part of the reason this is hard to see is because your code is often doing more than one thing on each line:
If I space out that code block, there are a number of things in there that are apparent:
First up, for some
OK, back to the bug.... in the final else block, you have:
This block increments both the
There are two ways to solve this, but I prefer treating the equals values as if the one is larger than the other.... if you convert the one comparison check to be
it also means you only need one
ReWrite
I am not one of those people who thinks 'continue' is a bad thing. I like it for the job it can do, but your use of the
```
private static void merge(Comparable[] a, Comparable[] b, int lo, int hi,
int mid) {
int i = lo;
int j = mi
In Java, it is not safe to create an array of a generic type. This is what you are doing...
Comparable is really Comparable, and you are creating an array of Comparable[].Because the generic types get removed when you compile the code (see Type Erasure), there is no safe way for Java to ensure that you have the same types of data in your array.
For example, you could have specified your input as:
Integer[] data = new Integer[10];Then, even though it is an array of Integer, the java compiler only knows it is an array of Comparable, so it can allow you to say
data[0] = "broken";, but that will fail, at runtime, because you cannot put a String in to an array of Integer. because Java can only tell at runtime whether the data is valid input to a Generically typed array, it warns you whenever you use one.This leads to ugly warnings. The right solution is to use an appropriately typed collection
List data will be type-checkable at compile time, and java can 'do things right'.An alternate solution is to specify the full type of the Comparable as part of the method definition (a Generic Method... ):
public static > void sort(T[] a){
....
}... and then follow that same generic method in to your inner methods
It is complicated. There is some reference material here:
- Cannot create arrays of Parameterized Types
- Generic Methods
Style
You are not using white-space to help make things readable. Consider this line of code:
int mid = lo+(hi-lo)/2;This should be:
int mid = lo + (hi - lo) / 2;See the java Code-Style guidelines for this:
All binary operators except . should be separated from their operands by spaces. Blank spaces should never separate unary operators such as unary minus, increment ("++"), and decrement ("--") from their operands
Additionally, you have a number of complex 1-liners that should be split on to multiple lines:
if(i>mid && j<=hi) {b[k]=a[j++];continue;}That should be:
if(i > mid && j <= hi) {
b[k]=a[j++];
continue;
}Algorithm
MidPoint
You have also copied your code from some reference texts, or otherwise done some mid-point research:
int mid = lo+(hi-lo)/2This mechanism for getting the mid point of the array is the right pattern to use. Anyone who does it this way is either copying the code, has run in to the bug, or has been well-educated. This is a good thing, but, in case you did not know why you were doing it this way, now you know.
Bug
I believe you have a bug in the code when you have two values the same, one in each part of the merge....
... part of the reason this is hard to see is because your code is often doing more than one thing on each line:
if(a[i].compareTo(a[j])>0) {b[k]=a[j++];continue;}
else if(a[i].compareTo(a[j])<0) {b[k]=a[i++]; continue;}
else {
b[k]=a[i++];
j++;
continue;
}If I space out that code block, there are a number of things in there that are apparent:
if(a[i].compareTo(a[j]) > 0) {
b[k]=a[j++];
continue;
} else if(a[i].compareTo(a[j]) < 0) {
b[k]=a[i++];
continue;
} else {
b[k]=a[i++];
j++;
continue;
}First up, for some
Comparable classes the compareTo method can be expensive. and, you will be doing it twice for many checks. You should do it once, and save away the result:int comparison = a[i].compareTo(a[j])
if (comparison > 0) {
...
} else if (comparison < 0) {
...
} else {
...
}OK, back to the bug.... in the final else block, you have:
} else {
b[k]=a[i++];
j++;
continue;
}This block increments both the
i and the j pointers, but it only adds the i value to the merged set, which means the equal value at j is 'lost'.There are two ways to solve this, but I prefer treating the equals values as if the one is larger than the other.... if you convert the one comparison check to be
>= 0 instead of just > 0, you will successfully merge results.... then you can get rid of the whole last block entirely....it also means you only need one
compareTo check:if(a[i].compareTo(a[j]) >= 0) {
b[k]=a[j++];
continue;
} else {
b[k]=a[i++];
continue;
}ReWrite
I am not one of those people who thinks 'continue' is a bad thing. I like it for the job it can do, but your use of the
continue is excessive.... Every path through the loop is a 'continue... this is something easily solved with a little planning. Consider this rewrite:```
private static void merge(Comparable[] a, Comparable[] b, int lo, int hi,
int mid) {
int i = lo;
int j = mi
Code Snippets
Integer[] data = new Integer[10];public static <T extends Comparable<?>> void sort(T[] a){
....
}int mid = lo+(hi-lo)/2;int mid = lo + (hi - lo) / 2;if(i>mid && j<=hi) {b[k]=a[j++];continue;}Context
StackExchange Code Review Q#42590, answer score: 3
Revisions (0)
No revisions yet.