Fast regex to extract strings before and after a time
Problem
I want to get text1 and text2 by splitting on the time format:
Text1 10:24:02 Text2
Of the two working regexes below, which is faster?
Regex 1
String regex1= "([0-9]{2}):([0-9]{2}):([0-9]{2})";
Regex 2
String regex2= "[0-9][0-9]:[0-9][0-9]:[0-9][0-9]";
This is the code I've used:
String s = "Text1 10:24:02 Text2";
String[] split = s.split(regex1); // tried with both regex1 and regex2
System.out.println(split[0]);
System.out.println(split[1]);
Solution
Regex 2 is faster, but probably not for the reason that you expect.
You can easily answer questions like these by writing a benchmark. Here is an example:
// Times 1000 splits of text with the given regex, in milliseconds.
public static long splitTime(String regex, String text) {
    long start = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
        String[] split = text.split(regex);
    }
    long end = System.currentTimeMillis();
    return end - start;
}

public static void main(String[] args) {
    String regex1 = "([0-9]{2}):([0-9]{2}):([0-9]{2})";
    String regex2 = "[0-9][0-9]:[0-9][0-9]:[0-9][0-9]";
    String s = "Text1 10:24:02 Text2";

    // Warm up the loops
    for (int i = 0; i < 2000; i++) {
        splitTime(regex1, s);
        splitTime(regex2, s);
    }

    // Timed runs, interleaved in both orders
    long time1 = 0, time2 = 0;
    for (int i = 0; i < 2000; i++) {
        time1 += splitTime(regex1, s);
        time2 += splitTime(regex2, s);
        time2 += splitTime(regex2, s);
        time1 += splitTime(regex1, s);
    }
    System.out.println("Regex 1: " + time1);
    System.out.println("Regex 2: " + time2);
}
The JIT compiler tends to do funny tricks. For fairness, I've warmed up the loop by executing both of them without timing. I've also interleaved the calls to splitTime in case the order somehow makes a difference.
I found that Regex 1 is slower than Regex 2 by about 5%.
However, Regex 1 has some capturing parentheses. If you remove them,
String regex0 = "[0-9]{2}:[0-9]{2}:[0-9]{2}";
then you get a result that is 16% faster than Regex 2.
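As a rough sketch of how the same comparison could be extended to include the non-capturing variant, the class below times all three patterns side by side. The class name RegexSplitBench, the helper splitTimeNanos, the sink field, and the reduced loop counts are illustrative assumptions, not part of the measurement above. It swaps in System.nanoTime() for finer resolution and adds up the split lengths so the loop body has an observable effect. Absolute numbers and percentages will vary by JVM and machine.

```java
import java.util.Arrays;

public class RegexSplitBench {
    // The three patterns from the discussion above.
    public static final String REGEX0 = "[0-9]{2}:[0-9]{2}:[0-9]{2}";        // no capturing groups
    public static final String REGEX1 = "([0-9]{2}):([0-9]{2}):([0-9]{2})";  // capturing groups
    public static final String REGEX2 = "[0-9][0-9]:[0-9][0-9]:[0-9][0-9]";  // repeated classes

    // Accumulates split results so the timed loop has an observable effect.
    public static long sink = 0;

    // Hypothetical helper: runs `iterations` splits and returns elapsed nanoseconds.
    // String.split(regex) compiles these patterns on every call, so compilation
    // cost is part of the measurement, as in the benchmark above.
    public static long splitTimeNanos(String regex, String text, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += text.split(regex).length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String s = "Text1 10:24:02 Text2";
        String[] regexes = {REGEX0, REGEX1, REGEX2};

        // Sanity check: all three patterns must split the sample identically.
        for (String regex : regexes) {
            if (!Arrays.equals(s.split(REGEX0), s.split(regex))) {
                throw new IllegalStateException("Different split for " + regex);
            }
        }

        // Warm up without recording times (loop counts are arbitrary assumptions).
        for (int round = 0; round < 200; round++) {
            for (String regex : regexes) {
                splitTimeNanos(regex, s, 500);
            }
        }

        // Measure, running the patterns forwards and then backwards in each round.
        long[] totals = new long[regexes.length];
        for (int round = 0; round < 200; round++) {
            for (int j = 0; j < regexes.length; j++) {
                totals[j] += splitTimeNanos(regexes[j], s, 500);
            }
            for (int j = regexes.length - 1; j >= 0; j--) {
                totals[j] += splitTimeNanos(regexes[j], s, 500);
            }
        }

        for (int j = 0; j < regexes.length; j++) {
            System.out.println("Regex " + j + ": " + totals[j] / 1_000_000 + " ms");
        }
        System.out.println("(sink = " + sink + ")");
    }
}
```

For anything beyond a quick check like this, a dedicated harness such as JMH is generally a more reliable way to measure small differences on the JVM.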
Code Snippets
public static long splitTime(String regex, String text) {
    long start = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
        String[] split = text.split(regex);
    }
    long end = System.currentTimeMillis();
    return end - start;
}

public static void main(String[] args) {
    String regex1 = "([0-9]{2}):([0-9]{2}):([0-9]{2})";
    String regex2 = "[0-9][0-9]:[0-9][0-9]:[0-9][0-9]";
    String s = "Text1 10:24:02 Text2";

    // Warm up the loops
    for (int i = 0; i < 2000; i++) {
        splitTime(regex1, s);
        splitTime(regex2, s);
    }

    long time1 = 0, time2 = 0;
    for (int i = 0; i < 2000; i++) {
        time1 += splitTime(regex1, s);
        time2 += splitTime(regex2, s);
        time2 += splitTime(regex2, s);
        time1 += splitTime(regex1, s);
    }
    System.out.println("Regex 1: " + time1);
    System.out.println("Regex 2: " + time2);
}
String regex0 = "[0-9]{2}:[0-9]{2}:[0-9]{2}";
Context
StackExchange Code Review Q#67194, answer score: 4