patternjavaMinor
Sending templatized e-mail to a million contacts
Viewed 0 times
millioncontactssendingmailtemplatized
Problem
This code runs fine without a bug, I need to optimize this code for following interview requirement.
Lets say i need to send it to 1 million contacts and emailBody is ~100Kb.
What code optimization do you recommend to make it faster ?
Write code or algorithm to explain the optimization you think will make this code faster, let's say 10x, 100x ?
My version is:
```
public String sendEmailUsingStringBuilder(String emailBody, Email email) {
HashMap dataHash = new HashMap();
dataHash.put("{{USER_EMAIL}}", email.getEmail());
dataHash.put("{{USER_NAME}}", email.getName());
dataHash.put("{{USER_PHONE}}", email.getPhone());
dataHash.put("{{LOGIN_URL}}", email.getLoginUrl());
StringBuilder strBuilder = new StringBuilder(emailBody);
for ( String tag : dataHash.keySet() ) {
int tagIndex;
while ((tagIndex = strBuilder.indexOf(tag)) != -1) {
strBuilder.replace(tagIndex, tagIndex + tag.length(),
(String) dataHash.get(tag));
tagInd
Lets say i need to send it to 1 million contacts and emailBody is ~100Kb.
What code optimization do you recommend to make it faster ?
Write code or algorithm to explain the optimization you think will make this code faster, let's say 10x, 100x ?
public String sendEmail(String emailBody, Email email) {
Hashtable dataHash = new Hashtable();
dataHash.put("{{USER_EMAIL}}", email.getEmail());
dataHash.put("{{USER_NAME}}", email.getName());
dataHash.put("{{USER_PHONE}}", email.getPhone());
dataHash.put("{{LOGIN_URL}}", email.getLoginUrl());
Enumeration tagList = dataHash.keys();
while (tagList.hasMoreElements()) {
String tag = tagList.nextElement();
while (emailBody.indexOf(tag) != -1) {
int indexOf = emailBody.toLowerCase()
.indexOf(tag.toLowerCase());
String part1 = emailBody.substring(0, indexOf);
String part2 = emailBody.substring(indexOf + tag.length(),
emailBody.length());
emailBody = part1 + (String) dataHash.get(tag) + part2;
}
}
return emailBody;
}My version is:
```
public String sendEmailUsingStringBuilder(String emailBody, Email email) {
HashMap dataHash = new HashMap();
dataHash.put("{{USER_EMAIL}}", email.getEmail());
dataHash.put("{{USER_NAME}}", email.getName());
dataHash.put("{{USER_PHONE}}", email.getPhone());
dataHash.put("{{LOGIN_URL}}", email.getLoginUrl());
StringBuilder strBuilder = new StringBuilder(emailBody);
for ( String tag : dataHash.keySet() ) {
int tagIndex;
while ((tagIndex = strBuilder.indexOf(tag)) != -1) {
strBuilder.replace(tagIndex, tagIndex + tag.length(),
(String) dataHash.get(tag));
tagInd
Solution
The big picture
First of all, consider the possibility that this is a trick question. When sending bulk e-mail, the performance bottleneck is likely to be with the Mail Transfer Agent or the database query that fetches the contacts rather than with the text generation. I would raise a concern with the interviewer that perhaps we should gather some evidence before making a premature effort to optimize the wrong part of the process. Optimizing the code in this question (which takes about 2 seconds to run) will not make any noticeable difference to the time necessary to conduct the e-mail campaign.
Review
That said, if you had to optimize the text generation process, your changes would be a mild improvement. The
Strictly speaking, both the original code and your version of it are buggy. Because string substitution is being done in multiple passes, it is possible that the output from one round of substitution could introduce a string that looks like a placeholder that ends up being inappropriately substituted in a subsequent round. (Another example of this class of bug.) Here, it's unlikely to be a problem unless someone decided to deliberately exploit the bug.
Alternative approach
If I really had to optimize this code, I would take a different approach entirely. The same
… into code that looks like this:
The performance is likely to be unbeatable, since there is no longer any string substitution going on. This is, in fact, very similar to the code that a JSP engine generates, and you could try to integrate a JSP implementation instead of reinventing the wheel.
As a bonus, you can stack layers of writers. For example, you might pass it a subclass of
When I tried generating a million messages with a pre-compiled template, it completed the task in half the time of your code. If you include the time to compile the template into executable form, it's still better than break-even for one million messages. For a hundred million messages, it's about four times faster, compilation overhead included.
First of all, consider the possibility that this is a trick question. When sending bulk e-mail, the performance bottleneck is likely to be with the Mail Transfer Agent or the database query that fetches the contacts rather than with the text generation. I would raise a concern with the interviewer that perhaps we should gather some evidence before making a premature effort to optimize the wrong part of the process. Optimizing the code in this question (which takes about 2 seconds to run) will not make any noticeable difference to the time necessary to conduct the e-mail campaign.
Review
That said, if you had to optimize the text generation process, your changes would be a mild improvement. The
HashMap is more lightweight than the synchronized Hashtable, and the StringBuilder does save a bit of copying, though not by as much as you think, since you are still making four global substitution passes through the entire emailBody.Strictly speaking, both the original code and your version of it are buggy. Because string substitution is being done in multiple passes, it is possible that the output from one round of substitution could introduce a string that looks like a placeholder that ends up being inappropriately substituted in a subsequent round. (Another example of this class of bug.) Here, it's unlikely to be a problem unless someone decided to deliberately exploit the bug.
Alternative approach
If I really had to optimize this code, I would take a different approach entirely. The same
emailBody template is likely to used for a million emails. Therefore, it would make sense to "pre-compile" the template…Dear {{USER_NAME}},
According to our records, your phone number is {{USER_PHONE}} and
your e-mail address is {{USER_EMAIL}}. If this is incorrect, please
go to {{LOGIN_URL}} and update your contact information.
… into code that looks like this:
public class C implements Template {
public void write(Writer out, Map params) throws IOException {
out.write("Dear ");
out.write(params.get("USER_NAME"));
out.write("\n\nAccording to our records, your phone number is ");
out.write(params.get("USER_PHONE"));
out.write(" and \nyour e-mail address is ");
out.write(params.get("USER_EMAIL"));
out.write(". If this is incorrect, please \ngo to ");
out.write(params.get("LOGIN_URL"));
out.write(" and update your contact information.\n");
out.flush();
}
}The performance is likely to be unbeatable, since there is no longer any string substitution going on. This is, in fact, very similar to the code that a JSP engine generates, and you could try to integrate a JSP implementation instead of reinventing the wheel.
As a bonus, you can stack layers of writers. For example, you might pass it a subclass of
FilterWriter that performs Format=Flowed line wrapping as per RFC 2646 to keep line lengths to a reasonable length as required by SMTP. You could also stack that directly on top of a Writer that feeds the text directly to an SMTP socket. Or, to obtain the resulting text as a string as in the original code, just use a StringWriter.When I tried generating a million messages with a pre-compiled template, it completed the task in half the time of your code. If you include the time to compile the template into executable form, it's still better than break-even for one million messages. For a hundred million messages, it's about four times faster, compilation overhead included.
Code Snippets
public class C implements Template {
public void write(Writer out, Map<String, String> params) throws IOException {
out.write("Dear ");
out.write(params.get("USER_NAME"));
out.write("\n\nAccording to our records, your phone number is ");
out.write(params.get("USER_PHONE"));
out.write(" and \nyour e-mail address is ");
out.write(params.get("USER_EMAIL"));
out.write(". If this is incorrect, please \ngo to ");
out.write(params.get("LOGIN_URL"));
out.write(" and update your contact information.\n");
out.flush();
}
}Context
StackExchange Code Review Q#102137, answer score: 4
Revisions (0)
No revisions yet.