Soundex algorithm implementation in Java

Submitted by: @import:stackexchange-codereview
Tags: algorithm, implementation, java, soundex

Problem

I just started learning about Java Strings and tried to implement the Soundex algorithm.

Package holding the String-related functions:

```
package com.java.strings;

public class StringFunctions {
    /**
     * Removes all the spaces in a given String.
     * E.g.: "A B CDD" becomes "ABCDD"
     */
    public static String squeeze(String in) {
        String temp = "";
        StringBuilder sb = new StringBuilder(in.trim());
        int i = 0;

        while (i < sb.length()) {
            if (sb.charAt(i) == ' ') {
                // Starting with the current position, shift all
                // the characters from right to left.
                for (int j = i; j < sb.length() - 1; j++)
                    sb.setCharAt(j, sb.charAt(j + 1));

                // The length of string is reduced by 1
                temp = sb.substring(0, sb.length() - 1);

                sb.setLength(0);
                sb.append(temp);
            }

            // After shifting the characters from right to left, the new
            // character in the current position might be a Space. If so,
            // the same position has to be processed again.
            if (sb.charAt(i) != ' ')
                i++;
        }
        return sb.toString();
    }

    /**
     * Removes Continuous Duplicate characters in a string.
     * E.g.: "AAABCCCDDDBB" becomes "ABCDB"
     */
    public static String removeContDupChars(String in) {
        String temp = "";
        StringBuilder sb = new StringBuilder(in);
        int i = 0;
        char prevChar;

        while (i < sb.length()) {
            prevChar = sb.charAt(i);
            for (int j = i + 1; j < sb.length(); j++) {
                // As long as there are same characters, Replace all the Duplicates
                // with Space.
                if (prevChar == sb.charAt(j))
                    sb.setCharAt(j, ' ');
                else
                    // Where there is a different char, break the inner loop.
                    …
```

Solution

Bugs

The Soundex code for "Jackson" should be "J250". You fail to elide the "C" and the "K", and as a result, your code returns "J225" instead.

The Soundex code for "Wu" should be "W000", and the code for "Google" should be "G240". Your code crashes with a StringIndexOutOfBoundsException for both.

If the input contains a non-alphabetic character, then SoundExClass.getValue() crashes with a NullPointerException due to unboxing a null Character.
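
To illustrate that last failure, here is a minimal sketch. The original getValue() is not shown here, so DigitLookupSketch, its CODES map, and both method names are hypothetical stand-ins; the only assumption is that the lookup goes through a Map<Character, Character> and the result is assigned to a char.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup inside getValue(); not the original code.
final class DigitLookupSketch {
    private static final Map<Character, Character> CODES = new HashMap<>();
    static {
        for (char c : "BFPV".toCharArray()) CODES.put(c, '1');
        for (char c : "CGJKQSXZ".toCharArray()) CODES.put(c, '2');
        for (char c : "DT".toCharArray()) CODES.put(c, '3');
        CODES.put('L', '4');
        for (char c : "MN".toCharArray()) CODES.put(c, '5');
        CODES.put('R', '6');
    }

    private DigitLookupSketch() {}

    // Map.get() returns null for an unknown key; converting that null
    // Character to a char (auto-unboxing) throws NullPointerException.
    static char unsafeCode(char letter) {
        return CODES.get(letter);
    }

    // getOrDefault() never returns null here, so unboxing is safe.
    // Treating unknown characters as '0' is an assumption of this sketch.
    static char safeCode(char letter) {
        return CODES.getOrDefault(letter, '0');
    }
}
```

Whether unknown characters should be ignored or rejected with an IllegalArgumentException is a design decision; either is preferable to an accidental NullPointerException.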

Organization

The com.java.strings package name infringes on someone else's namespace, assuming that you are not the owner of the java.com domain. By convention, a package name starts with a reversed domain name that you control.

The SoundExClass class should be public; how else will other people call your code? But …Class is a pretty cumbersome Hungarian suffix. Furthermore, I think that there isn't much point in forcing your users to split what should be a simple function call into an object instantiation and a method call. I would just make it:

```
public class Soundex {
    // Suppress default constructor
    private Soundex() {}

    public static String soundex(String name) {
        …
    }
}
```


In SoundExClass, variables map, vowels, notvowels, and dropChars should not be public. You don't want any other code to be able to alter their contents. (Note that being final does not make them unmodifiable.)
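
A small sketch of the difference between final and unmodifiable. The types of the original fields are not shown here, so the List<Character> fields and every name in FinalVsUnmodifiableSketch are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; field names and types are assumptions.
final class FinalVsUnmodifiableSketch {
    // 'final' only prevents reassigning the reference. Any caller that can
    // see this field can still add to or remove from the list.
    static final List<Character> exposedVowels =
            new ArrayList<>(Arrays.asList('A', 'E', 'I', 'O', 'U'));

    // Private AND wrapped: outside code cannot reach the field, and every
    // mutator on the wrapper throws UnsupportedOperationException.
    private static final List<Character> VOWELS =
            Collections.unmodifiableList(
                    new ArrayList<>(Arrays.asList('A', 'E', 'I', 'O', 'U')));

    private FinalVsUnmodifiableSketch() {}

    static boolean isVowel(char c) {
        return VOWELS.contains(c);
    }

    // Handing out the wrapper is safe because it is read-only.
    static List<Character> vowelsView() {
        return VOWELS;
    }
}
```

Calling exposedVowels.add('Y') succeeds and silently changes behavior for everyone, whereas vowelsView().add('Y') fails fast.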

The getValue() method should be private, since it's an implementation detail that nobody outside the class should be concerned about.

Implementation

removeContDupChars() is superfluous. There is no point in removing consecutive duplicate characters from the input string. Just map the characters to their respective digits; you will eventually elide the repeats anyway when you get to the step commented "// If two or more letters with the same number are adjacent in, only retain the first letter."

squeeze() would be a lot simpler if you took advantage of StringBuilder.deleteCharAt(). I think that the squeeze() function should operate directly on a StringBuilder.
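
For instance, a squeeze() that works in place on a StringBuilder could be as short as the following sketch. The class name SqueezeSketch is hypothetical; StringBuilder.deleteCharAt() is the standard API method.

```java
// Hypothetical holder class; only squeeze() matters here.
final class SqueezeSketch {
    private SqueezeSketch() {}

    /** Removes every space from sb, in place. */
    static void squeeze(StringBuilder sb) {
        // Walk from the end so that a deletion never shifts
        // the characters that have not been examined yet.
        for (int i = sb.length() - 1; i >= 0; i--) {
            if (sb.charAt(i) == ' ') {
                sb.deleteCharAt(i);
            }
        }
    }
}
```

This also removes the need for the temporary String and for the special case of re-checking the same position after a shift.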

The comments in implementSoundEx() are helpful, but what would be even better is if each small code block were its own function operating on a StringBuilder. That would make the functionality even clearer. Taking a cue from this Haskell solution, I would suggest rewriting implementSoundEx() to look more like this:

```
public static String soundex(String s) {
    s = s.toUpperCase().trim();
    if (s.isEmpty()) throw new IllegalArgumentException();

    StringBuilder sb = new StringBuilder(s);
    digitize(sb);
    removeContDupChars(sb);
    squeeze(sb);

    // setCharAt() and setLength() return void, so they cannot be chained.
    sb.setCharAt(0, s.charAt(0));
    sb.append("000").setLength(4);
    return sb.toString();
}
```
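
One way the three helpers might be filled in is sketched below. None of these bodies come from the original post: the class name SoundexSketch, the CODES table, and the helper implementations are assumptions. In particular, this sketch treats H, W, Y, and any non-letter like vowels (code '0'), which is a simplification of the official rules, and it has squeeze() leave index 0 alone so that a leading vowel is not deleted before the first letter is restored.

```java
import java.util.Locale;

// Hypothetical, simplified Soundex; not the original poster's code.
final class SoundexSketch {
    // Digit for each letter A..Z; '0' means "carries no code".
    private static final String CODES = "01230120022455012623010202";

    private SoundexSketch() {}

    static String soundex(String name) {
        String s = name.toUpperCase(Locale.ROOT).trim();
        if (s.isEmpty() || s.charAt(0) < 'A' || s.charAt(0) > 'Z') {
            throw new IllegalArgumentException("name must start with a letter");
        }
        StringBuilder sb = new StringBuilder(s);
        digitize(sb);
        removeContDupChars(sb);
        squeeze(sb);
        sb.setCharAt(0, s.charAt(0)); // restore the first letter
        sb.append("000");             // pad ...
        sb.setLength(4);              // ... then truncate to 4 characters
        return sb.toString();
    }

    // Replaces every character with its digit (non-letters become '0').
    private static void digitize(StringBuilder sb) {
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            sb.setCharAt(i, (c >= 'A' && c <= 'Z') ? CODES.charAt(c - 'A') : '0');
        }
    }

    // Collapses each run of identical characters to a single character.
    private static void removeContDupChars(StringBuilder sb) {
        for (int i = sb.length() - 1; i > 0; i--) {
            if (sb.charAt(i) == sb.charAt(i - 1)) sb.deleteCharAt(i);
        }
    }

    // Drops the '0' placeholders, except at index 0.
    private static void squeeze(StringBuilder sb) {
        for (int i = sb.length() - 1; i > 0; i--) {
            if (sb.charAt(i) == '0') sb.deleteCharAt(i);
        }
    }
}
```

With these definitions, the three names from the Bugs section come out as expected: "Jackson" gives "J250", "Wu" gives "W000", and "Google" gives "G240".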

Context

StackExchange Code Review Q#117913, answer score: 4
