Fastest possible text template for repeated use?
Problem
While reviewing Sending templatized e-mail to a million contacts, I wrote this implementation to illustrate an alternate approach. It is designed to be the fastest possible way to generate templated text repeatedly. Is it?
I used the in-memory Java compiler featured in this Stack Overflow answer.
I think that the `stringLiteral()` function and the try-catch block that performs the compilation are rather ugly.
Template.java
```
import java.io.IOException;
import java.io.Writer;
import java.util.Map;

public interface Template {
    public void write(Writer out, Map<String,String> params) throws IOException;
}
```
TemplateCompiler.java
```
import java.io.*;
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.tools.*;
import org.mdkt.compiler.InMemoryJavaCompiler;

public class TemplateCompiler {
    private static final Pattern SUBST_PAT = Pattern.compile(
        "(?<LITERAL>.*?)(?:\\{\\{(?<SUBST>[^}]*)\\}\\})"
    );

    /**
     * Instantiates a Template that performs simple
     * text substitutions for {{PLACEHOLDERS}}.
     */
    public static Template compile(String templateText) {
        int rest = 0;
        StringBuilder script = new StringBuilder(
            "import java.io.IOException;\n" +
            "import java.io.Writer;\n" +
            "import java.util.Map;\n" +
            "public class C implements Template {\n" +
            "    public void write(Writer out, Map<String,String> params) throws IOException {\n"
        );
        for (Matcher m = SUBST_PAT.matcher(templateText); m.find(); rest = m.end()) {
            script.append("out.write(")
                  .append(stringLiteral(m.group("LITERAL")))
                  .append(");\nout.write(params.get(")
                  .append(stringLiteral(m.group("SUBST")))
                  .append("));\n");
        }
        script.append("out.write(")
              .append(stringLiteral(templateText.substring(rest)))
```
Solution
The compile concept produced incorrect results for me: when I run your code, the template does not produce the output I expect. For the input parameters:
```
final Map<String,String> parms = new HashMap<>();
Stream.of("USER_NAME", "USER_PHONE", "USER_EMAIL", "LOGIN_URL")
        .forEach(tag -> parms.put(tag, tag));
```
I would expect the input String:
```
"Dear {{USER_NAME}},\n\n" +
"According to our records, your phone number is {{USER_PHONE}} and " +
"your e-mail address is {{USER_EMAIL}}. If this is incorrect, please " +
"go to {{LOGIN_URL}} and update your contact information."
```
to produce:
```
Dear USER_NAME,

According to our records, your phone number is USER_PHONE and your e-mail address is USER_EMAIL. If this is incorrect, please go to LOGIN_URL and update your contact information.
```
But, instead, it produces:
```
Dear USER_NAMEAccording to our records, your phone number is USER_PHONE and your e-mail address is USER_EMAIL. If this is incorrect, please go to LOGIN_URL and update your contact information.
```
I have looked through the code, and I am not sure why it is dropping the newlines, and the comma-punctuation after "USER_NAME".
I looked through the `TemplateCompiler` code, and while I like that you use a Pattern/Matcher to parse the template, the actual loop structure is really complicated. You shoe-horn the process into a for-loop, when a while-loop would be much better. Additionally, you use a complicated double-matching named-group regular expression, when a single-matching one would be more than adequate.
I particularly dislike the `rest` variable, and how it is used.
I wonder if this complicated regex logic is the cause of the broken output?
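One plausible explanation for the dropped text, sketched here under assumptions: if the question's pattern is `(?<LITERAL>.*?)(?:\{\{(?<SUBST>[^}]*)\}\})` (the group names come from the `m.group("LITERAL")` and `m.group("SUBST")` calls; the quantifiers are inferred), then `.` does not match line terminators unless `Pattern.DOTALL` is set. A literal run containing `\n` can therefore never be captured, and `find()` silently skips ahead to the first position after the newlines. The class `DotAllDemo` and its `render` helper are hypothetical names for this illustration; the helper mimics the question's for-loop and substitutes each key with itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Assumed reconstruction of the question's pattern (see the note above).
    static final String REGEX = "(?<LITERAL>.*?)(?:\\{\\{(?<SUBST>[^}]*)\\}\\})";

    // Same loop shape as the question: literal text, then the placeholder,
    // with each placeholder replaced by its own key.
    static String render(Pattern pattern, String text) {
        StringBuilder sb = new StringBuilder();
        int rest = 0;
        for (Matcher m = pattern.matcher(text); m.find(); rest = m.end()) {
            sb.append(m.group("LITERAL")).append(m.group("SUBST"));
        }
        return sb.append(text.substring(rest)).toString();
    }

    public static void main(String[] args) {
        String text = "Dear {{USER_NAME}},\n\nYour phone is {{USER_PHONE}}.";

        // Default flags: '.' stops at '\n', so ",\n\n" is skipped entirely.
        System.out.println(render(Pattern.compile(REGEX), text));

        // DOTALL: '.' also matches '\n', so the literal run is preserved.
        System.out.println(render(Pattern.compile(REGEX, Pattern.DOTALL), text));
    }
}
```

Under that assumption, either compiling the pattern with `Pattern.DOTALL` or taking the literal text as the substring between consecutive matches would avoid the loss.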
I wrote a "competing" code block, and I also chose regex to parse the template, but my loop is very different:
```
private static final Pattern token = Pattern.compile("\\{\\{(\\w+)\\}\\}");

public static Template compile(String text) {
    Matcher mat = token.matcher(text);
    int last = 0;
    while (mat.find()) {
        // the non-token text is from the last match end,
        // to this match start
        final String constant = text.substring(last, mat.start());
        // this token's key is the regex group
        final String key = mat.group(1);
        // do stuff with the text and subsequent token
        // ...
        last = mat.end();
    }
    final String tail = text.substring(last);
    if (!tail.isEmpty()) {
        // do something with trailing text after last token.
        // ...
    }
}
```
A while loop on the `Matcher.find()` result is the natural loop constraint.
Instead of compiling the code down, I used an array of text injectors to perform the write. Some injectors inject a constant value, others inject a lookup value from the Parameters. I was able to reduce your class down to much simpler constructs, with no code abstraction and compilation, etc. From a readability and maintenance perspective I believe it is clearly better:
```
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MonkeyFix implements Template {

    // Produces one piece of output text for a given parameter map.
    @FunctionalInterface
    private interface Injector {
        String get(Map<String,String> params);
    }

    private static final Pattern token = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    public static Template compile(final String text) {
        final Matcher mat = token.matcher(text);
        final List<Injector> sequence = new ArrayList<>();
        int last = 0;
        while (mat.find()) {
            final String constant = text.substring(last, mat.start());
            final String key = mat.group(1);
            // constant text before the token, then the token's lookup
            sequence.add(params -> constant);
            sequence.add(params -> params.get(key));
            last = mat.end();
        }
        final String tail = text.substring(last);
        if (!tail.isEmpty()) {
            sequence.add(params -> tail);
        }
        return new MonkeyFix(sequence.toArray(new Injector[sequence.size()]));
    }

    private final Injector[] sequence;

    public MonkeyFix(Injector[] sequence) {
        this.sequence = sequence;
    }

    @Override
    public void write(Writer out, Map<String,String> params) throws IOException {
        for (Injector lu : sequence) {
            out.write(lu.get(params));
        }
    }
}
```
How about the performance, though?
I put the code through my MicroBench suite, using the following code (I had to use a different validation string for your code; I called that one `wrong` ... ;-) :
```
public class TemplateMain {
    private static final String text =
            "Dear {{USER_NAME}},\n\n" +
            "According to our records, your phone number is {{USER_PHONE}} and " +
            "your e-mail address is {{USER_EMAIL}}. If this is incorrect, please " +
            "go to {{LOGIN_URL}} and update your contact information.";
    private static final Template inmemcomp = TemplateCompiler.compile(text);
    private static final Template monkeyfix = MonkeyFix.compile(text);

    private static final String inMemFunc(Template t, Map<String,String> params) {
        StringWriter sw = new St
```
Context
StackExchange Code Review Q#102339, answer score: 9