HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Split string by character in Java

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
javasplitstringcharacter

Problem

I have this code, which goes through a string and split it using a char delimiter.

I am concerned where it can be improved.

  • Main concerns are speed and memory usage with speed as the priority.



  • Any other improvements are welcome too.



  • Is there a better way (Without regex based split and without 3rd party libraries).



What it does:



  • Iterate through string and extract all portions of strings delimited by given char



  • return all portions, including empty ones.



  • StringUtility class name is used because in future I can add other utility methods when necessory




Code:

public class StringUtility {

    /**
     * Split a string using a single char delimiter
     * @param strToSplit string to use
     * @param delimiter delimiter
     * @return String[] of portions
     */
    public static String[] split(String strToSplit, char delimiter) {
        ArrayList arr = new ArrayList();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < strToSplit.length(); i++) {
            char at = strToSplit.charAt(i);
            if (at == delimiter) {
                arr.add(sb.toString());
                sb = new StringBuilder();
            } else {
                sb.append(at);
            }
        }
        arr.add(sb.toString());
        return arr.toArray(new String[0]);
    }
}


Unit Test:

This code currently passes following jUnit Test Cases

```
public class StringUtilityTest {

@Test
public void test1() {
final String str1 = "Because";
final String str2 = "I'm";
final String str3 = "Batman";
final char delim = ' ';
String[] parts = StringUtility.split(str1 + delim + str2 + delim + str3,
delim);
Assert.assertTrue(parts.length == 3);
Assert.assertTrue(parts[0].equals(str1));
Assert.assertTrue(parts[1].equals(str2));
Assert.assertTrue(parts[2].equals(str3));
}

@Test
public void test2() {
final Strin

Solution

Sloppiness

There are several careless mistakes that are easy to fix.

Don't use implementation types in declaration, unless you really need to use a specific feature of that implementation:

ArrayList arr = new ArrayList();


Use the interface type instead:

List arr = new ArrayList();


Instead of creating a new StringBuilder, it might be faster to reset it, like this:

sb.setLength(0);


See this discussion for more details.

Don't convert a List to an array like this:

return arr.toArray(new String[0]);


It's recommended to pass a correctly sized array:

return arr.toArray(new String[arr.size()]);


Better unit tests

Give your test cases meaningful names. So that when your IDE reports some of them failing, it will be easier to understand what failed. For example:

@Test
public void testNormalSentence() {
    final String str1 = "Because";
    final String str2 = "I'm";
    final String str3 = "Batman";
    final char delim = ' ';
    String[] parts = StringUtility.split(str1 + delim + str2 + delim + str3, delim);
    Assert.assertEquals(Arrays.asList(str1, str2, str3), Arrays.asList(parts));
}

@Test
public void testStartingWithDelim() {
    final String str1 = "";
    final String str2 = "I'm";
    final String str3 = "Batman";
    final char delim = ' ';
    String[] parts = StringUtility.split(str1 + delim + str2 + delim + str3, delim);
    Assert.assertEquals(Arrays.asList(str1, str2, str3), Arrays.asList(parts));
}

@Test
public void testEmptyString() {
    final String str1 = "";
    final char delim = ' ';
    String[] parts = StringUtility.split(str1, delim);
    Assert.assertEquals(Arrays.asList(str1), Arrays.asList(parts));
}


These new names describe well exactly what they are testing. This should help understanding the code.

The assertions can be simplified, instead of:

String[] parts = StringUtility.split(str1 + delim + str2 + delim + str3, delim);
Assert.assertTrue(parts.length == 3);
Assert.assertTrue(parts[0].equals(str1));
Assert.assertTrue(parts[1].equals(str2));
Assert.assertTrue(parts[2].equals(str3));


You could simplify like this:

String[] parts = StringUtility.split(str1 + delim + str2 + delim + str3, delim);
Assert.assertEquals(Arrays.asList(str1, str2, str3), Arrays.asList(parts));


This is exactly the same, but shorter, and a lot easier to write.

I would add some more test cases:

@Test
public void testOnlyDelim() {
    final String str1 = "";
    final char delim = ' ';
    String[] parts = StringUtility.split("" + delim + delim, delim);
    Assert.assertEquals(Arrays.asList(str1, str1, str1), Arrays.asList(parts));
}

@Test
public void testNotContainingDelim() {
    final String str1 = "hello";
    final char delim = 'x';
    String[] parts = StringUtility.split(str1, delim);
    Assert.assertEquals(Arrays.asList(str1), Arrays.asList(parts));
}


Suggested implementation

@heslacher's implementation is very close to my taste. Here's a variation of that, with some minor changes:

public static String[] split(String strToSplit, char delimiter) {
    List arr = new ArrayList<>();
    int foundPosition;
    int startIndex = 0;
    while ((foundPosition = strToSplit.indexOf(delimiter, startIndex)) > -1) {
        arr.add(strToSplit.substring(startIndex, foundPosition));
        startIndex = foundPosition + 1;
    }
    arr.add(strToSplit.substring(startIndex));
    return arr.toArray(new String[arr.size()]);
}

Code Snippets

ArrayList<String> arr = new ArrayList<String>();
List<String> arr = new ArrayList<String>();
sb.setLength(0);
return arr.toArray(new String[0]);
return arr.toArray(new String[arr.size()]);

Context

StackExchange Code Review Q#59126, answer score: 6

Revisions (0)

No revisions yet.