patternjavaMinor
“Space and Time Efficient” way of finding all anagram groups
Viewed 0 times
spacegroupsallefficientwaytimeanagramfindingand
Problem
I have written a program to find all anagram groups in a text file. I used first 26 Prime numbers as a mapper for 26 characters for finding anagram groups (Since character_set of all anagrams of a particular word are same, the products of the anagrams are also same ). After finding product of anagram I put (product, anagrams) pair into HashMap for retrieving later. But I think my solution of retrieving anagrams for particular character_set is not efficient although it is working. I need best practice for my case for iterating through HashMap. Here is the code:
```
public class Find_All_Anagrams_In_File {
// First 26 Primes for corresponding Alphabet letters
private static final int[] PRIMES = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 };
private static long calcProduct(char[] letters) {
long res = 1L;
for (char c: letters) {
if (c findAllAnagrams(String[] file) {
Map anagrams = new HashMap();
Map result = new HashMap();
for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );
anagrams.put(word, product);
}
StringBuffer sb = new StringBuffer();
for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );
if (result.get(product) == null)
result.put(product, sb.append("" + word));
else
result.put(product, sb.append(result.get(product) + "," + word));
sb = new StringBuffer();
}
return result;
}
@SuppressWarnings({"rawtypes" })
public static void main(String[] args) {
/*
* for,fro,rof
* tentak,takten
* aks,kas
* xew,wex,xwe
* marza,maraz
* nakra,karan
* simple,mislep
*
*/
String[] file = {"for", "fro", "aks", "ten", "xew", "kas",
"uvn", "marza", "take", "random", "tent
```
public class Find_All_Anagrams_In_File {
// First 26 Primes for corresponding Alphabet letters
private static final int[] PRIMES = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 };
private static long calcProduct(char[] letters) {
long res = 1L;
for (char c: letters) {
if (c findAllAnagrams(String[] file) {
Map anagrams = new HashMap();
Map result = new HashMap();
for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );
anagrams.put(word, product);
}
StringBuffer sb = new StringBuffer();
for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );
if (result.get(product) == null)
result.put(product, sb.append("" + word));
else
result.put(product, sb.append(result.get(product) + "," + word));
sb = new StringBuffer();
}
return result;
}
@SuppressWarnings({"rawtypes" })
public static void main(String[] args) {
/*
* for,fro,rof
* tentak,takten
* aks,kas
* xew,wex,xwe
* marza,maraz
* nakra,karan
* simple,mislep
*
*/
String[] file = {"for", "fro", "aks", "ten", "xew", "kas",
"uvn", "marza", "take", "random", "tent
Solution
calcProductYou don't need the
L on these lines:long res = 1L;
return -1L;Instead of explaining what 65 is in a comment:
if (c < 65) { // ASCII A = 65You can use
'A' directly. And checking c 'Z'.The method rewritten:
private static long calcProduct(char[] letters) {
long res = 1;
for (char c : letters) {
if (c 'Z') {
return -1;
}
int index = c - 'A';
res = res * PRIMES[index];
}
return res;
}findAllAnagramsInstead of
StringBuffer, it's recommended to use StringBuilder.Instead of
new HashMap() you can use the diamond operator <> to simplify.The
anagrams map is written to but never read. You can remove it.file is a poor name for an array of words. (words would be better)The concatenation in
sb.append("" + word) is pointless, as sb.append(word) has the same effect.In general, instead of concatenation inside
sb.append(...), it's recommended to chain .append(...).append(...) calls.Instead of creating a new
StringBuffer in every iteration of the loop, you could just create when it's actually needed: when putting the first word in the map.Finally, it's recommended to always use braces on single-statement if-else conditions.
The method rewritten:
public static Map findAllAnagrams(String[] words) {
Map result = new HashMap<>();
for (String word : words) {
long product = calcProduct(word.toUpperCase().toCharArray());
StringBuilder sb = result.get(product);
if (sb == null) {
result.put(product, new StringBuilder(word));
} else {
sb.append(",").append(word);
}
}
return result;
}But, this is not nearly good enough. It would be better to use a
List as the values:public static Map> findAllAnagrams(String[] words) {
Map> result = new HashMap<>();
for (String word : words) {
long product = calcProduct(word.toUpperCase().toCharArray());
List anagrams = result.get(product);
if (anagrams == null) {
anagrams = new ArrayList<>();
result.put(product, anagrams);
}
anagrams.add(word);
}
return result;
}mainYou should not suppress the raw types warning but fix it. The loop could have been written like this:
for (Map.Entry entry : result.entrySet()) {And since you don't need the keys in the loop body, it's better to iterate over the values:
for (List anagrams : findAllAnagrams(file).values()) {
if (anagrams.size() > 1) {
for (String word : anagrams) {
System.out.print(word + " ");
}
System.out.println();
}
}Code Snippets
long res = 1L;
return -1L;if (c < 65) { // ASCII A = 65private static long calcProduct(char[] letters) {
long res = 1;
for (char c : letters) {
if (c < 'A' || c > 'Z') {
return -1;
}
int index = c - 'A';
res = res * PRIMES[index];
}
return res;
}public static Map<Long, StringBuilder> findAllAnagrams(String[] words) {
Map<Long, StringBuilder> result = new HashMap<>();
for (String word : words) {
long product = calcProduct(word.toUpperCase().toCharArray());
StringBuilder sb = result.get(product);
if (sb == null) {
result.put(product, new StringBuilder(word));
} else {
sb.append(",").append(word);
}
}
return result;
}public static Map<Long, List<String>> findAllAnagrams(String[] words) {
Map<Long, List<String>> result = new HashMap<>();
for (String word : words) {
long product = calcProduct(word.toUpperCase().toCharArray());
List<String> anagrams = result.get(product);
if (anagrams == null) {
anagrams = new ArrayList<>();
result.put(product, anagrams);
}
anagrams.add(word);
}
return result;
}Context
StackExchange Code Review Q#122976, answer score: 4
Revisions (0)
No revisions yet.