debugjavaMinor
Parsing error messages using regular expressions in Java 8
Viewed 0 times
errorregularexpressionsjavaparsingusingmessages
Problem
I have a below method in Java 8 which checks for message pattern and finds the line number from the message.
Example : message can be of below types
The above method finds the line number from the message and returns the instance of invoice from the arraylist.
Since Java 8 has some new features am wondering whether my code can be optimized further.
private LawfirmInvoice getInvoiceForErrormessage(InvoiceParcel parcel, String message) {
if (parcel.getInvoices() != null) {
if (parcel.getInvoices().size() == 1)
return parcel.getInvoices().get(0);
else {
String regex1 = "^Line\\s+\\:\\s+\\d.+";
String regex2 = "^Line\\s\\d\\s\\:\\s+.+";
Pattern p = Pattern.compile("\\d+");
Matcher m;
int lineNumber = -1;
if (message.matches(regex1)) {
m = p.matcher(message);
if (m.find()) {
lineNumber = Integer.parseInt(m.group());
} else if (message.matches(regex2)) {
m = p.matcher(message);
if (m.find()) {
lineNumber = Integer.parseInt(m.group()) - 2;
}
}
}
if (lineNumber != -1)
return parcel.getInvoices().get(lineNumber);
}
}
return null;
}Example : message can be of below types
msg1 = "Line : 1 Invoice does not foot Reported"msg2 = "Line 3 : Could not parse ADJUSTMENT_AMOUNT value"The above method finds the line number from the message and returns the instance of invoice from the arraylist.
Since Java 8 has some new features am wondering whether my code can be optimized further.
Solution
First, let's optimize what you already have. Extract the regexes, and simplify the logic:
OK, the regex is more complicated, but, there's only one of them, and it is compiled only once. The regex looks for digits on either side of the colon, and returns them in either group 1, or group 2. Note that the regex expression
Also, note that your original regex probably has a bug, because it searches for just a single-digit in it:
Additionally, there is no need to escape the
Now, there's only 1 regex, not three. It is statically compiled, and our code logic is streamlined.
What we now have, is a function, it takes two arguments, and returns a third. I cannot see any references to 'state' in your class, so your method can easily be a static method. This makes your function a proper
This is where you can start having 'fun' with java 8 streams and so on:
(Assuming a
private static final String LINE_EX = "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+"
private static final Pattern LINE_PARSE = Pattern.compile(LINE_EX);
private LawfirmInvoice getInvoiceForErrormessage(InvoiceParcel parcel, String message) {
if (parcel.getInvoices() == null) {
return null;
}
if (parcel.getInvoices().size() == 1) {
return parcel.getInvoices().get(0);
}
Matcher mat = LINE_PARSER.matcher(message);
if (!mat.matches()) {
return null;
}
String group = mat.group(1) != null ? mat.group(1) : mat.group(2);
if (group == null) {
// neither regex group had digits....
return null;
}
int lineNumber = Integer.parseInt(group);
return parcel.getInvoices().get(lineNumber);
}OK, the regex is more complicated, but, there's only one of them, and it is compiled only once. The regex looks for digits on either side of the colon, and returns them in either group 1, or group 2. Note that the regex expression
(?: .... ) is a non-grouping match, so does not count as a group, but the (\\d+) inside it, does.Also, note that your original regex probably has a bug, because it searches for just a single-digit in it:
String regex2 = "^Line\\s\\d\\s\\:\\s+.+"; when it should probably look for \\d+ and not just \\d.Additionally, there is no need to escape the
: in your regexes.Now, there's only 1 regex, not three. It is statically compiled, and our code logic is streamlined.
What we now have, is a function, it takes two arguments, and returns a third. I cannot see any references to 'state' in your class, so your method can easily be a static method. This makes your function a proper
java.util.function.BiFunction function if you want it to be, e.g.:BiFunction GET_ERROR_INVOICE =
MyClass::getInvoiceForErrormessage;This is where you can start having 'fun' with java 8 streams and so on:
final List parcels = ......
// create an array of all LawFirmInvoices that have errors...
parcels.stream()
.flatMap(parcel -> getMessagesFor(parcel)
.stream(message -> getInvoiceForErrormessage(parcel, message))
.filter(invoice -> invoice != null)
.toArray(size -> new LawFirmInvoice[size]);(Assuming a
getMessagesFor(parcel) exists....)Code Snippets
private static final String LINE_EX = "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+"
private static final Pattern LINE_PARSE = Pattern.compile(LINE_EX);
private LawfirmInvoice getInvoiceForErrormessage(InvoiceParcel parcel, String message) {
if (parcel.getInvoices() == null) {
return null;
}
if (parcel.getInvoices().size() == 1) {
return parcel.getInvoices().get(0);
}
Matcher mat = LINE_PARSER.matcher(message);
if (!mat.matches()) {
return null;
}
String group = mat.group(1) != null ? mat.group(1) : mat.group(2);
if (group == null) {
// neither regex group had digits....
return null;
}
int lineNumber = Integer.parseInt(group);
return parcel.getInvoices().get(lineNumber);
}BiFunction<InvoiceParcel, String, LawfirmInvoice> GET_ERROR_INVOICE =
MyClass::getInvoiceForErrormessage;final List<InvoiceParcel> parcels = ......
// create an array of all LawFirmInvoices that have errors...
parcels.stream()
.flatMap(parcel -> getMessagesFor(parcel)
.stream(message -> getInvoiceForErrormessage(parcel, message))
.filter(invoice -> invoice != null)
.toArray(size -> new LawFirmInvoice[size]);Context
StackExchange Code Review Q#84986, answer score: 5
Revisions (0)
No revisions yet.