Fastest possible text template for repeated use?

Problem

While reviewing Sending templatized e-mail to a million contacts, I wrote this implementation to illustrate an alternate approach. It is designed to be the fastest possible way to generate templated text repeatedly. Is it?

I used the in-memory Java compiler featured in this Stack Overflow answer.

I think that the stringLiteral() function and the try-catch block that performs the compilation are rather ugly.
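
The listing further down is cut off before stringLiteral() appears, so its actual body is unknown. As a rough sketch of what such a helper has to do (turn arbitrary text into a Java string literal that can be embedded in generated source), something like the following would work. The class name `Literals` and the exact escaping rules are assumptions for illustration, not the author's code.

```java
final class Literals {
    /** Returns s as Java source for a string literal, including the surrounding quotes. */
    static String stringLiteral(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2).append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20 || c > 0x7e) {
                        // A Unicode escape keeps the generated source pure ASCII.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Prints the literal form of a string containing quotes and a newline.
        System.out.println(stringLiteral("Dear \"you\",\n"));
    }
}
```

Line terminators, quotes and backslashes are handled by name rather than as Unicode escapes, because the Java compiler translates Unicode escapes before it tokenizes the source.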

Template.java

import java.io.IOException;
import java.io.Writer;
import java.util.Map;

public interface Template {
    public void write(Writer out, Map<String, String> params) throws IOException;
}
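
For orientation, the generator in TemplateCompiler.java emits a class whose write() method is just a run of out.write(...) calls, alternating between constants and map lookups. A hand-written equivalent for the template "Hello {{NAME}}!" might look like the sketch below. The names `GeneratedShapeDemo`, `HelloTemplate` and `render` are made up for this example, and the interface is repeated locally only so that the snippet compiles on its own.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Collections;
import java.util.Map;

final class GeneratedShapeDemo {
    // Local copy of the Template interface so the sketch is self-contained.
    interface Template {
        void write(Writer out, Map<String, String> params) throws IOException;
    }

    // Roughly what generated code for "Hello {{NAME}}!" boils down to:
    // constant writes and map lookups, with no parsing at run time.
    static final class HelloTemplate implements Template {
        @Override
        public void write(Writer out, Map<String, String> params) throws IOException {
            out.write("Hello ");
            out.write(params.get("NAME"));
            out.write("!");
        }
    }

    // Convenience helper (hypothetical): renders a template into a String.
    static String render(Template t, Map<String, String> params) {
        StringWriter sw = new StringWriter();
        try {
            t.write(sw, params);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(new HelloTemplate(), Collections.singletonMap("NAME", "world")));
    }
}
```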


TemplateCompiler.java

```
import java.io.*;
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.tools.*;
import org.mdkt.compiler.InMemoryJavaCompiler;

public class TemplateCompiler {
    private static final Pattern SUBST_PAT = Pattern.compile(
        "(?<LITERAL>.*?)(?:\\{\\{(?<SUBST>[^}]*)\\}\\})"
    );

    /**
     * Instantiates a Template that performs simple
     * text substitutions for {{PLACEHOLDERS}}.
     */
    public static Template compile(String templateText) {
        int rest = 0;
        StringBuilder script = new StringBuilder(
            "import java.io.IOException;\n" +
            "import java.io.Writer;\n" +
            "import java.util.Map;\n" +
            "public class C implements Template {\n" +
            " public void write(Writer out, Map<String,String> params) throws IOException {\n"
        );
        for (Matcher m = SUBST_PAT.matcher(templateText); m.find(); rest = m.end()) {
            script.append("out.write(")
                  .append(stringLiteral(m.group("LITERAL")))
                  .append(");\nout.write(params.get(")
                  .append(stringLiteral(m.group("SUBST")))
                  .append("));\n");
        }
        script.append("out.write(")
              .append(stringLiteral(templateText.substring(rest)))
```

Solution

The compiled template produced incorrect results for me. For the input parameters:

final Map<String,String> parms = new HashMap<>();
Stream.of("USER_NAME", "USER_PHONE", "USER_EMAIL", "LOGIN_URL")
      .forEach(tag -> parms.put(tag, tag));


I would expect the input String:

"Dear {{USER_NAME}},\n\n" +
"According to our records, your phone number is {{USER_PHONE}} and " +
"your e-mail address is {{USER_EMAIL}}.  If this is incorrect, please " +
"go to {{LOGIN_URL}} and update your contact information."


to produce:

Dear USER_NAME,

According to our records, your phone number is USER_PHONE and your e-mail address is USER_EMAIL.  If this is incorrect, please go to LOGIN_URL and update your contact information.


But, instead, it produces:

Dear USER_NAMEAccording to our records, your phone number is USER_PHONE and your e-mail address is USER_EMAIL.  If this is incorrect, please go to LOGIN_URL and update your contact information.


I have looked through the code, and I am not sure why it is dropping the newlines and the comma after "USER_NAME".

I looked through the TemplateCompiler code, and while I like that you use a Pattern/Matcher to parse the template, the actual loop structure is really complicated. You shoehorn the process into a for-loop when a while-loop would be much better. Additionally, you use a complicated double-matching named-group regular expression when a single-matching one would be more than adequate.

I particularly dislike the rest variable, and how it is used.

I wonder if this complicated regex logic is the cause of the broken output?
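
A likely explanation, assuming the pattern in TemplateCompiler reads `(?<LITERAL>.*?)(?:\{\{(?<SUBST>[^}]*)\}\})` (a reconstruction based on the group names used in the loop): by default `.` does not match line terminators, so the lazy `.*?` cannot span the `,\n\n` that follows `{{USER_NAME}}`. `find()` then moves forward to the first position from which a match is possible, and the skipped text is never emitted. The sketch below reproduces the effect on a shortened template and shows that `Pattern.DOTALL` avoids it. The class and method names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DotAllDemo {
    // Assumed reconstruction of the question's pattern.
    static final String REGEX = "(?<LITERAL>.*?)(?:\\{\\{(?<SUBST>[^}]*)\\}\\})";

    // Mirrors the question's loop: emit each LITERAL, then the looked-up SUBST,
    // and finally whatever follows the last match.
    static String render(Pattern p, String text, Map<String, String> params) {
        StringBuilder out = new StringBuilder();
        int rest = 0;
        for (Matcher m = p.matcher(text); m.find(); rest = m.end()) {
            out.append(m.group("LITERAL")).append(params.get(m.group("SUBST")));
        }
        return out.append(text.substring(rest)).toString();
    }

    public static void main(String[] args) {
        String text = "Dear {{A}},\n\nphone {{B}}.";
        Map<String, String> params = new HashMap<>();
        params.put("A", "A");
        params.put("B", "B");

        // Without DOTALL the ",\n\n" after the first placeholder is lost.
        System.out.println(render(Pattern.compile(REGEX), text, params));
        // With DOTALL the literal text between placeholders survives.
        System.out.println(render(Pattern.compile(REGEX, Pattern.DOTALL), text, params));
    }
}
```

Matching only the placeholder and taking the literal text from the gaps between matches sidesteps the problem entirely, because the literal text is never run through the regex.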

I wrote a "competing" code block, and I also chose regex to parse the template, but my loop is very different:

private static final Pattern token = Pattern.compile("\\{\\{(\\w+)\\}\\}");

public static Template compile(String text) {

    Matcher mat = token.matcher(text);

    int last = 0;

    while (mat.find()) {
        // the non-token text is from the last match end,
        // to this match start
        final String constant = text.substring(last, mat.start());
        // this token's key is the regex group
        final String key = mat.group(1);

        // do stuff with the text and subsequent token
        ....

        last = mat.end();
    }
    final String tail = text.substring(last);
    if (!tail.isEmpty()) {
        // do something with trailing text after last token.
        ....
    }

}


A while loop on the Matcher.find() result is the natural loop constraint.
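
To check that this loop shape keeps the text between tokens, the elided steps can be filled in with plain appends to a StringBuilder. This is only an illustrative stand-in for the "do stuff" placeholders, with a hypothetical class name, and it renders directly instead of building a reusable template.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WhileLoopRender {
    private static final Pattern TOKEN = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    static String render(String text, Map<String, String> params) {
        Matcher mat = TOKEN.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (mat.find()) {
            out.append(text, last, mat.start());   // constant text before the token
            out.append(params.get(mat.group(1)));  // looked-up value for the token
            last = mat.end();
        }
        out.append(text, last, text.length());     // trailing text after the last token
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("USER_NAME", "USER_NAME");
        params.put("USER_PHONE", "USER_PHONE");
        System.out.println(render("Dear {{USER_NAME}},\n\nphone {{USER_PHONE}}.", params));
    }
}
```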

Instead of compiling the code down, I used an array of text injectors to perform the write. Some injectors inject a constant value; others inject a value looked up from the parameters. This reduces your class to much simpler constructs, with no code generation or compilation step. From a readability and maintenance perspective, I believe it is clearly better:

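import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MonkeyFix implements Template {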

    @FunctionalInterface
    private interface Injector {
        String get(Map<String,String> params);
    }

    private static final Pattern token = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    public static Template compile(final String text) {
        final Matcher mat = token.matcher(text);
        final List<Injector> sequence = new ArrayList<>();
        int last = 0;

        while (mat.find()) {
            final String constant = text.substring(last, mat.start());
            final String key = mat.group(1);

            sequence.add(params -> constant);
            sequence.add(params -> params.get(key));

            last = mat.end();
        }

        final String tail = text.substring(last);
        if (!tail.isEmpty()) {
            sequence.add(params -> tail);
        }

        return new MonkeyFix(sequence.toArray(new Injector[sequence.size()]));
    }

    private final Injector[] sequence;

    public MonkeyFix(Injector[] sequence) {
        this.sequence = sequence;
    }

    @Override
    public void write(Writer out, Map<String,String> params) throws IOException {
        for (Injector lu : sequence) {
            out.write(lu.get(params));
        }
    }

}
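
The benefit of this design shows when one compiled template is written many times with different parameter maps. The condensed, self-contained variant below illustrates that usage. It is a sketch rather than the class above: it uses `java.util.function.Function` in place of the private `Injector` interface, and it substitutes an empty string for missing keys via `getOrDefault`, which is an added assumption about the desired behaviour.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InjectorDemo {
    // Local copy of the Template interface so the sketch is self-contained.
    interface Template {
        void write(Writer out, Map<String, String> params) throws IOException;
    }

    private static final Pattern TOKEN = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Parses the text once; the returned Template only walks a prepared list.
    static Template compile(String text) {
        List<Function<Map<String, String>, String>> parts = new ArrayList<>();
        Matcher mat = TOKEN.matcher(text);
        int last = 0;
        while (mat.find()) {
            String constant = text.substring(last, mat.start());
            String key = mat.group(1);
            parts.add(params -> constant);
            parts.add(params -> params.getOrDefault(key, "")); // assumption: missing key renders as ""
            last = mat.end();
        }
        String tail = text.substring(last);
        parts.add(params -> tail);
        return (out, params) -> {
            for (Function<Map<String, String>, String> part : parts) {
                out.write(part.apply(params));
            }
        };
    }

    // Convenience helper (hypothetical): renders a template into a String.
    static String render(Template t, Map<String, String> params) {
        StringWriter sw = new StringWriter();
        try {
            t.write(sw, params);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        Template greeting = compile("Hi {{NAME}}, your code is {{CODE}}."); // parsed once
        for (String name : new String[] {"Ann", "Bob"}) {
            Map<String, String> params = new HashMap<>();
            params.put("NAME", name);
            params.put("CODE", name.toUpperCase());
            System.out.println(render(greeting, params)); // reused for every recipient
        }
    }
}
```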


How about the performance, though?

I put the code through my MicroBench suite, using the following code (I had to use a different validation string for your code; I called that one wrong ... ;-) ):

```
public class TemplateMain {

    private static final String text =
        "Dear {{USER_NAME}},\n\n" +
        "According to our records, your phone number is {{USER_PHONE}} and " +
        "your e-mail address is {{USER_EMAIL}}. If this is incorrect, please " +
        "go to {{LOGIN_URL}} and update your contact information.";

    private static final Template inmemcomp = TemplateCompiler.compile(text);

    private static final Template monkeyfix = MonkeyFix.compile(text);

    private static final String inMemFunc(Template t, Map<String,String> params) {
        StringWriter sw = new St
```

Context

StackExchange Code Review Q#102339, answer score: 9