HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Synonymiser refactoring

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
refactoringsynonymiserstackoverflow

Problem

I am interested in seeing how people would refactor this code to make it more readable, remove local variables, etc. The code checks a string against a number of arrays. The first (0) field of each array is the root word and the rest of the words are its synonyms. The idea is to replace any word in the user-provided string with the root word or provide the user-supplied word if none were found.

import java.util.Arrays;

public class Synonymiser {

    private final static int ROOT_WORD = 0;

    private String[][] mSynonyms;

    public Synonymiser(String[]... synonyms) {
        super();
        this.mSynonyms = synonyms;
    }

    public String normaliseString(String inStr){
        String[] inArr = inStr.split(" ");
        boolean matchFound;

        for (int i = 0; i < inArr.length; i++){
            matchFound=false;
            for (String[] synonymArr : mSynonyms){
                 for(int j = 0 ; j < synonymArr.length; j++){
                     if (inArr[i].equalsIgnoreCase(synonymArr[j])){
                         inArr[i] = synonymArr[ROOT_WORD];
                         matchFound = true;
                     }
                     if (matchFound) break; 
                 }
                 if (matchFound)  break; 
            }
        }
        return Arrays.toString(inArr);
    }

    public static void main(String [] args){
        String[] ar1 = new String[] {"one", "two", "three", "four", "five"};
        String[] ar2 = new String[] {"six", "seven", "eight"};
        Synonymiser s = new Synonymiser(ar1, ar2);

        System.out.println(s.normaliseString("one two seven bulb"));
    }

}

Solution

I would like to go beyond just making the code prettier, and change the method that you are using. Instead of looping through the arrays looking for matches, you can put the synonyms in a hashed list. That will give you a performance close to O(1) for each lookup, instead of O(n).

Example in C#:

public class Synonymiser {

  private Dictionary _synonyms = new Dictionary();

  public Synonymiser AddWord(string word, params string[] synonyms) {
    foreach (string synonym in synonyms) {
      _synonyms.Add(synonym.ToUpper(), word);
    }
    return this;
  }

  public string NormaliseString(string inStr){
    string replacement;
    return String.Join(
      " ",
      inStr.Split(' ')
      .Select(w => _synonyms.TryGetValue(w.ToUpper(), out replacement) ? replacement : w)
    );
  }

}


Usage:

Synonymiser s =
  new Synonymiser()
  .AddWord("one", "two", "three", "four", "five")
  .AddWord("six", "seven", "eight");

Console.WriteLine(s.NormaliseString("one two seven bulb"));


Output:

one one six bulb

Code Snippets

public class Synonymiser {

  private Dictionary<string, string> _synonyms = new Dictionary<string, string>();

  public Synonymiser AddWord(string word, params string[] synonyms) {
    foreach (string synonym in synonyms) {
      _synonyms.Add(synonym.ToUpper(), word);
    }
    return this;
  }

  public string NormaliseString(string inStr){
    string replacement;
    return String.Join(
      " ",
      inStr.Split(' ')
      .Select(w => _synonyms.TryGetValue(w.ToUpper(), out replacement) ? replacement : w)
    );
  }

}
Synonymiser s =
  new Synonymiser()
  .AddWord("one", "two", "three", "four", "five")
  .AddWord("six", "seven", "eight");

Console.WriteLine(s.NormaliseString("one two seven bulb"));
one one six bulb

Context

StackExchange Code Review Q#7036, answer score: 5

Revisions (0)

No revisions yet.