HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

“Space and Time Efficient” way of finding all anagram groups

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
spacegroupsallefficientwaytimeanagramfindingand

Problem

I have written a program to find all anagram groups in a text file. I used first 26 Prime numbers as a mapper for 26 characters for finding anagram groups (Since character_set of all anagrams of a particular word are same, the products of the anagrams are also same ). After finding product of anagram I put (product, anagrams) pair into HashMap for retrieving later. But I think my solution of retrieving anagrams for particular character_set is not efficient although it is working. I need best practice for my case for iterating through HashMap. Here is the code:

```
public class Find_All_Anagrams_In_File {

// First 26 Primes for corresponding Alphabet letters
private static final int[] PRIMES = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 };

private static long calcProduct(char[] letters) {
long res = 1L;
for (char c: letters) {
if (c findAllAnagrams(String[] file) {

Map anagrams = new HashMap();
Map result = new HashMap();

for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );
anagrams.put(word, product);
}

StringBuffer sb = new StringBuffer();

for (String word: file) {
long product = calcProduct( word.toUpperCase().toCharArray() );

if (result.get(product) == null)
result.put(product, sb.append("" + word));
else
result.put(product, sb.append(result.get(product) + "," + word));
sb = new StringBuffer();
}
return result;
}

@SuppressWarnings({"rawtypes" })
public static void main(String[] args) {

/*
* for,fro,rof
* tentak,takten
* aks,kas
* xew,wex,xwe
* marza,maraz
* nakra,karan
* simple,mislep
*
*/
String[] file = {"for", "fro", "aks", "ten", "xew", "kas",
"uvn", "marza", "take", "random", "tent

Solution

calcProduct

You don't need the L on these lines:

long res = 1L;
return -1L;


Instead of explaining what 65 is in a comment:

if (c < 65) { // ASCII A = 65


You can use 'A' directly. And checking c 'Z'.

The method rewritten:

private static long calcProduct(char[] letters) {
    long res = 1;
    for (char c : letters) {
        if (c  'Z') {
            return -1;
        }
        int index = c - 'A';
        res = res * PRIMES[index];
    }
    return res;
}


findAllAnagrams

Instead of StringBuffer, it's recommended to use StringBuilder.

Instead of new HashMap() you can use the diamond operator <> to simplify.

The anagrams map is written to but never read. You can remove it.

file is a poor name for an array of words. (words would be better)

The concatenation in sb.append("" + word) is pointless, as sb.append(word) has the same effect.

In general, instead of concatenation inside sb.append(...), it's recommended to chain .append(...).append(...) calls.

Instead of creating a new StringBuffer in every iteration of the loop, you could just create when it's actually needed: when putting the first word in the map.

Finally, it's recommended to always use braces on single-statement if-else conditions.

The method rewritten:

public static Map findAllAnagrams(String[] words) {
    Map result = new HashMap<>();

    for (String word : words) {
        long product = calcProduct(word.toUpperCase().toCharArray());

        StringBuilder sb = result.get(product);
        if (sb == null) {
            result.put(product, new StringBuilder(word));
        } else {
            sb.append(",").append(word);
        }
    }
    return result;
}


But, this is not nearly good enough. It would be better to use a List as the values:

public static Map> findAllAnagrams(String[] words) {
    Map> result = new HashMap<>();

    for (String word : words) {
        long product = calcProduct(word.toUpperCase().toCharArray());

        List anagrams = result.get(product);
        if (anagrams == null) {
            anagrams = new ArrayList<>();
            result.put(product, anagrams);
        }
        anagrams.add(word);
    }
    return result;
}


main

You should not suppress the raw types warning but fix it. The loop could have been written like this:

for (Map.Entry entry : result.entrySet()) {


And since you don't need the keys in the loop body, it's better to iterate over the values:

for (List anagrams : findAllAnagrams(file).values()) {
        if (anagrams.size() > 1) {
            for (String word : anagrams) {
                System.out.print(word + " ");
            }
            System.out.println();
        }
    }

Code Snippets

long res = 1L;
return -1L;
if (c < 65) { // ASCII A = 65
private static long calcProduct(char[] letters) {
    long res = 1;
    for (char c : letters) {
        if (c < 'A' || c > 'Z') {
            return -1;
        }
        int index = c - 'A';
        res = res * PRIMES[index];
    }
    return res;
}
public static Map<Long, StringBuilder> findAllAnagrams(String[] words) {
    Map<Long, StringBuilder> result = new HashMap<>();

    for (String word : words) {
        long product = calcProduct(word.toUpperCase().toCharArray());

        StringBuilder sb = result.get(product);
        if (sb == null) {
            result.put(product, new StringBuilder(word));
        } else {
            sb.append(",").append(word);
        }
    }
    return result;
}
public static Map<Long, List<String>> findAllAnagrams(String[] words) {
    Map<Long, List<String>> result = new HashMap<>();

    for (String word : words) {
        long product = calcProduct(word.toUpperCase().toCharArray());

        List<String> anagrams = result.get(product);
        if (anagrams == null) {
            anagrams = new ArrayList<>();
            result.put(product, anagrams);
        }
        anagrams.add(word);
    }
    return result;
}

Context

StackExchange Code Review Q#122976, answer score: 4

Revisions (0)

No revisions yet.