HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Determining if two Strings have common subtrings of a given length in Java

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
determininglengthgivenjavasubtringstwostringscommonhave

Problem

Problem:


Given 2 strings, consider all the substrings within them of length
len. Len will be 1 or more. Returns true if there are any such
substrings which appear in both strings. Compute this in linear time
using a HashSet.

I know as a matter of fact that there are many areas upon which my code could be improved. However, rather than asking other people to correct my code, I would much prefer to be given hints or general principles as to how my code could be improved. I think this way, me, and everyone that read this post could get more out of this exercise this way.

So could someone please provide me with some comments or suggestions? Critics of any level and kind are welcomed! Feel free to pick apart my code!

Class:

```
import java.util.HashSet;
import java.util.Set;

// CS108 HW1 -- String static methods
public class StringCode {

public static boolean stringIntersect(String firstString, String secondString, int lengthOfSubstring) {

boolean sameSubstringInBothStrings = false;

Set setOne = StringCode.getSubstringsSet(firstString, lengthOfSubstring);
Set setTwo = StringCode.getSubstringsSet(secondString, lengthOfSubstring);

//compare setOne and setTwo to find out if there is any matching elements
outerForLoop:
for(String aSubstringInSetOne : setOne){
for(String aSubstringInSetTwo : setTwo){
if(aSubstringInSetOne.equals(aSubstringInSetTwo)){
sameSubstringInBothStrings = true;
break outerForLoop;
}
}
}

return sameSubstringInBothStrings;
}

static Set getSubstringsSet(String aString, int lengthOfSubstring){
Set setOfSubstrings = new HashSet<>();

if(aString.length() < lengthOfSubstring){
return setOfSubstrings;
}

if(aString.length() == lengthOfSubstring){
setOfSubstrings.add(aString);
return setOfSubstrings;

Solution

Don't add unnecessary pre-checks

Your method getSubstringsSet starts with two early returns:

if(aString.length() < lengthOfSubstring){
    return setOfSubstrings;
}

if(aString.length() == lengthOfSubstring){
    setOfSubstrings.add(aString);
    return setOfSubstrings;
}


You actually don't need them. The rest of the code handles both cases just fine and it doesn't add to clarity or improve performance. Early-returns should be used when the rest of the code can then assume a condition that was checked, making its code simpler; but it is not the case here.
Use built-in methods

To add the substrings, you are currently doing:

StringBuilder sb = new StringBuilder();

//add each substring of length (lengthOfSubstring) to the setOfSubstrings. 
for(int j = 0; j < lengthOfSubstring; j++){
    sb.append(charArray[i + j]);
}

setOfSubstrings.add(sb.toString());


So you're creating a new StringBuilder and then appending each character. This is a complicated way of retrieving the substring between two indices. It also allocates a new StringBuilder each time which isn't really needed.

You can just have

setOfSubstrings.add(aString.substring(i, i + lengthOfSubstring));


In the same way, stringIntersect is a lot more complicated than it should be.

First of all, you are using a label outerForLoop and then using it to break out of the inner loop with break outerForLoop;. This is generally not a good practice. What this shows is a missing method: what you really want is to make a method isIntersection(set1, set2) that would determine if the two sets have elements in common.

However, you don't even need to write one, since it is already built-in: Collections#disjoint. This method returns true if the two given collections are disjoint, i.e. have no elements in common. Therefore, you can remove your double for loop and just have:

boolean sameSubstringInBothStrings = !Collections.disjoint(setOne, setTwo);

Code Snippets

if(aString.length() < lengthOfSubstring){
    return setOfSubstrings;
}

if(aString.length() == lengthOfSubstring){
    setOfSubstrings.add(aString);
    return setOfSubstrings;
}
StringBuilder sb = new StringBuilder();

//add each substring of length (lengthOfSubstring) to the setOfSubstrings. 
for(int j = 0; j < lengthOfSubstring; j++){
    sb.append(charArray[i + j]);
}

setOfSubstrings.add(sb.toString());
setOfSubstrings.add(aString.substring(i, i + lengthOfSubstring));
boolean sameSubstringInBothStrings = !Collections.disjoint(setOne, setTwo);

Context

StackExchange Code Review Q#130023, answer score: 3

Revisions (0)

No revisions yet.