patternjavaMinor

String difference algorithm

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

differencestringalgorithm

Problem

I would like to get a feedback about the code below. Is there any way to improve its performance? Maybe you know input values that might print bad output? The idea of the code is to count unique characters from s2 that are not listed in s1.

Ideone.com URL

class Combine {
    public static void main(String[] args) throws IOException {
        BufferedReader bi = new BufferedReader(new InputStreamReader(System.in));
        String s1 = bi.readLine();
        String s2 = bi.readLine();

        String usedCharacters = "";

        for(int i = 0; i < s2.length(); i++) {
            String c = Character.toString(s2.charAt(i));
            if(!usedCharacters.contains(c) && !s1.contains(c)) 
                usedCharacters += c;
        }

        System.out.println(usedCharacters.length());
    }
}

Solution

There is no requirement to report the characters in any particular order. As a result, there's some tricks we can do to improve the performance.

Other tricks to use are:

work using primitive char[] arrays instead of Strings.

use better variable names

Sorting the data is a good first step:

char[] expect = bi.readLine().toCharArray();
    char[] search = bi.readLine().toCharArray();

Now we sort them both:

Arrays.sort(expect);
Arrays.sort(search);

Then, we loop through the search values, and look for characters that do not appear in expect:

StringBuilder usedCharacters = new StringBuilder();
int searchPos = 0;
while (searchPos < search.length) {
    int expectPos = Arrays.binarySearch(expect, search[searchPos]);
    if (expectPos < 0) {
        usedCharacters.append(search[searchPos]);
    }
    // advance to the next character, may be duplicates.
    searchPos++;
    while (searchPos < search.length && search[searchPos - 1] == search[searchPos]) {
        searchPos++;
    }
}

return usedCharacters.toString();

Now, how will this relate in terms of performance?

Your current code loops through each search character, which is an \$O(n)\$ operation. For each character, it then does a search through previously searched characters, and also the unsearched characters. The combination of these two loops leads to a \$O(nm)\$ operation.

In contrast, the sorts on the input data are \$O(n \log{n})\$, and \$O(m \log{m})\$ and the subsequent searches are \$O(n \log{m})\$

The end result will be much more favourable than \$O(nm)\$

Code Snippets

char[] expect = bi.readLine().toCharArray();
    char[] search = bi.readLine().toCharArray();

Arrays.sort(expect);
Arrays.sort(search);

StringBuilder usedCharacters = new StringBuilder();
int searchPos = 0;
while (searchPos < search.length) {
    int expectPos = Arrays.binarySearch(expect, search[searchPos]);
    if (expectPos < 0) {
        usedCharacters.append(search[searchPos]);
    }
    // advance to the next character, may be duplicates.
    searchPos++;
    while (searchPos < search.length && search[searchPos - 1] == search[searchPos]) {
        searchPos++;
    }
}

return usedCharacters.toString();

Context

StackExchange Code Review Q#72029, answer score: 2

Revisions (0)

No revisions yet.