HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Determining the existence of string Intersections

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
determiningtheexistenceintersectionsstring

Problem

Problem:


Given 2 strings, consider all the substrings within them
of length len. Len will be 1 or more.
Returns true if there are any such substrings which appear in both strings.
Compute this in linear time using a HashSet.

Solution:

import java.util.HashSet;

public class Standford3 {
    public static void main(String[] args) {
        System.out.println("Test 1: " +
            (false  == stringIntersect("blahblah", "foralheh", 3))
        );
        System.out.println("Test 2: " +
            (true  == stringIntersect("checking", "deck", 2))
        );
        System.out.println("Test 3: " +
            (false  == stringIntersect("derping", "slurp", 3))
        );
        System.out.println("Test 4: " +
            (false  == stringIntersect("foo", "bar", 1))
        );
        System.out.println("Test 5: " +
            (true  == stringIntersect("nowai", "55&dcsnow", 3))
        );
    }

    public static boolean stringIntersect(String a, String b, int len) {
        if (a.length() == 0 || b.length() == 0) { return false; }

        HashSet alpha = permutateString(a, len);
        HashSet beta = permutateString(b, len);

        for (String s : alpha) {
            if (beta.contains(s)) { return true; }
        }

        return false;
    }

    public static HashSet permutateString(String str, int i) {
        if (i > str.length()) {
            throw new IllegalArgumentException(
                "Substring length cannot be larger than provided string"
            );
        }

        HashSet set = new HashSet<>();
        int count = i;

        for (int j = 0; j  str.length()) { break; }

            set.add(str.substring(j, count));
            count++;
        }

        return set;
    }
}


-
Are these tests sufficient? Unlike the two challenges that preceded it, the tests are mine; is there anything I should be testing for that I missed?

-
I hope this isn't outside of codereview territory, but I'm wondering if something more

Solution

Some simple suggestions for refactoring on your current code (without going into the implementation)

  • Use Set over HashSet



Your original question says "Compute this in linear time using a HashSet.", but that usually doesn't mean you need to use HashSet as the return type. Set result = new HashSet<>() works better in the sense that callers of your method will not need to know the underlying implementation. Also, you'll be free to change it to LinkedHashSet without changing the method signature.

  • Variable names



Two points here.

i is usually used in for loops, so do consider using a better name in the method signature, e.g. permutateString(String str, int length).

Your count isn't actually counting anything, but to keep track of the last exclusive index for substring-ing (I think I just made a new word up!). For people who don't use String.substring() often, that line of code will instead read like you are creating sub-strings of increasing lengths.

-
Looping

for (int j = 0; j  str.length()) { break; }
    ...
}


This can be better expressed as:

for (int j = 0; j <= str.length() - length; j++ ) {
    result.add(str.substring(j, j + length));
}


Illustration:

1234567890
123
 234
  345
   456
    567
     678
      789
       890


So, to extract 3-character sub-strings from a 10-character string, we need to loop from i = 0 to i = 7, or `i

-
Unit testing

As mentioned elsewhere, please consider using unit testing frameworks like JUnit or TestNG. :)

Code Snippets

for (int j = 0; j < str.length(); j++ ) {
    if (count > str.length()) { break; }
    ...
}
for (int j = 0; j <= str.length() - length; j++ ) {
    result.add(str.substring(j, j + length));
}
1234567890
123
 234
  345
   456
    567
     678
      789
       890

Context

StackExchange Code Review Q#75454, answer score: 4

Revisions (0)

No revisions yet.