patternjavaMinor
Number aware string sorting with comparator
Viewed 0 times
sortingnumberwithstringcomparatoraware
Problem
I have this class for use in sorting strings such that if strings have a number in the same position it will order the numbers in increasing order.
Alphabetical gives:
What I'm calling "number aware" string sorting should give:
Here is what I have using a regex split from there.
The code seems to work. Any cases where I could run into problem? If not any suggestions on making it simpler or more efficient.
Alphabetical gives:
- file1
- file10
- file2
What I'm calling "number aware" string sorting should give:
- file1
- file2
- file10
Here is what I have using a regex split from there.
The code seems to work. Any cases where I could run into problem? If not any suggestions on making it simpler or more efficient.
import java.util.Comparator;
public class NumberAwareStringComparator implements Comparator{
public int compare(String s1, String s2) {
String[] s1Parts = s1.split("(? s2.length()){
return 1;
}else{
return 0;
}
}
}Solution
Exceptions should be reserved for exceptional situations, and should be avoided if possible. The fundamental reason you have to deal with
Here's a strategy that always compares non-digits to non-digits, and numbers to numbers.
Since strings of digits (such as those representing dates, like
Also, since the code works just as well with
NumberFormatException is that after splitting, you don't know whether each part contains a number or a non-number.Here's a strategy that always compares non-digits to non-digits, and numbers to numbers.
import java.math.BigInteger;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NumberAwareStringComparator implements Comparator {
public static final NumberAwareStringComparator INSTANCE =
new NumberAwareStringComparator();
private static final Pattern PATTERN = Pattern.compile("(\\D*)(\\d*)");
private NumberAwareStringComparator() {
}
public int compare(CharSequence s1, CharSequence s2) {
Matcher m1 = PATTERN.matcher(s1);
Matcher m2 = PATTERN.matcher(s2);
// The only way find() could fail is at the end of a string
while (m1.find() && m2.find()) {
// matcher.group(1) fetches any non-digits captured by the
// first parentheses in PATTERN.
int nonDigitCompare = m1.group(1).compareTo(m2.group(1));
if (0 != nonDigitCompare) {
return nonDigitCompare;
}
// matcher.group(2) fetches any digits captured by the
// second parentheses in PATTERN.
if (m1.group(2).isEmpty()) {
return m2.group(2).isEmpty() ? 0 : -1;
} else if (m2.group(2).isEmpty()) {
return +1;
}
BigInteger n1 = new BigInteger(m1.group(2));
BigInteger n2 = new BigInteger(m2.group(2));
int numberCompare = n1.compareTo(n2);
if (0 != numberCompare) {
return numberCompare;
}
}
// Handle if one string is a prefix of the other.
// Nothing comes before something.
return m1.hitEnd() && m2.hitEnd() ? 0 :
m1.hitEnd() ? -1 : +1;
}
}Since strings of digits (such as those representing dates, like
20131212123456.log) can overflow an int, I've used java.math.BigInteger.Also, since the code works just as well with
CharSequence as with String, I've generalized the type to Comparator.Code Snippets
import java.math.BigInteger;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NumberAwareStringComparator implements Comparator<CharSequence> {
public static final NumberAwareStringComparator INSTANCE =
new NumberAwareStringComparator();
private static final Pattern PATTERN = Pattern.compile("(\\D*)(\\d*)");
private NumberAwareStringComparator() {
}
public int compare(CharSequence s1, CharSequence s2) {
Matcher m1 = PATTERN.matcher(s1);
Matcher m2 = PATTERN.matcher(s2);
// The only way find() could fail is at the end of a string
while (m1.find() && m2.find()) {
// matcher.group(1) fetches any non-digits captured by the
// first parentheses in PATTERN.
int nonDigitCompare = m1.group(1).compareTo(m2.group(1));
if (0 != nonDigitCompare) {
return nonDigitCompare;
}
// matcher.group(2) fetches any digits captured by the
// second parentheses in PATTERN.
if (m1.group(2).isEmpty()) {
return m2.group(2).isEmpty() ? 0 : -1;
} else if (m2.group(2).isEmpty()) {
return +1;
}
BigInteger n1 = new BigInteger(m1.group(2));
BigInteger n2 = new BigInteger(m2.group(2));
int numberCompare = n1.compareTo(n2);
if (0 != numberCompare) {
return numberCompare;
}
}
// Handle if one string is a prefix of the other.
// Nothing comes before something.
return m1.hitEnd() && m2.hitEnd() ? 0 :
m1.hitEnd() ? -1 : +1;
}
}Context
StackExchange Code Review Q#37192, answer score: 8
Revisions (0)
No revisions yet.