patternjavaMinor
Counting the number of occurrences of chars in one string, in another string
Viewed 0 times
numberthecountingoccurrencesoneanothercharsstring
Problem
This program counts the total number of times any character from the source string can be found in the target string.
E.g Source String -
Target String -
Output =
Source String -
Target String -
Output =
Is there any better way of write the below program?
E.g Source String -
"Hello World"Target String -
"llo"Output =
5 (as there are 3 l's and 2 o's)Source String -
"Hello World"Target String -
"ooood"Output =
3 (as there are 2 o's and 1 d)Is there any better way of write the below program?
public class CountOccurence {
public static int countChars(String target, String source){
char a[] = source.toCharArray();
int loc = 0, count = 0;
boolean charRepeat = false;
for (loc = 0; loc < source.length(); loc++) {
charRepeat = false;
for (int z = 0; z < loc; z++) {
if (a[loc] == a[z]) {
charRepeat = true;
break;
}
}
if (charRepeat == false) {
int i = 0;
while ((i = target.indexOf(a[loc], i)) != -1) {
count++;
i++;
}
}
}
return count;
}
public static void main(String[] args) {
String target = "Hello World";
String source = "oood";
int count = countChars(target, source);
System.out.println(count);
source = "lld";
count = countChars(target, source);
System.out.println(count);
}
}Solution
Style
There are a few bad habits in here that should be resolved. Here's your core method:
let's go through some things that are style-based.
-
source comes before target. The natural thinking is that you go from the source to the target, your naming, and the order of the parameters, is awkward. I would call the
but in your code you have:
Note how you have swapped the source/target between the example and the code.
-
Your code uses a number of small variables that are unconventional.
Algorithm
Your algorithm is a bit messed up too. Basically, you scan the
Then, for each unique char, you then do a loop through all the
The net result is that you have a performance of \$O(nm)\$ or \$O(n^2)\$ depending on whether there are more characters in the
There is a better solution for both the uniqueness testing, and for the searching. Consider sorting the chars in each string:
The two sorts are \$O(n \log n)\$ performance, so this is, so far, significantly less complicated... now, how do we compare? Well, with a 'simple' loop over all the lookIn characters:
With the above code, you do two sorts, and 1 loop through each value. The net result is that your performance complexity is simply \$O(n \log n)\$ where \$n\$ is the longer of the
By sorting each side, you give yourself a significant algorithmic advantage.
There are a few bad habits in here that should be resolved. Here's your core method:
public static int countChars(String target, String source){
char a[] = source.toCharArray();
int loc = 0, count = 0;
boolean charRepeat = false;
for (loc = 0; loc < source.length(); loc++) {
charRepeat = false;
for (int z = 0; z < loc; z++) {
if (a[loc] == a[z]) {
charRepeat = true;
break;
}
}
if (charRepeat == false) {
int i = 0;
while ((i = target.indexOf(a[loc], i)) != -1) {
count++;
i++;
}
}
}
return count;
}let's go through some things that are style-based.
-
source comes before target. The natural thinking is that you go from the source to the target, your naming, and the order of the parameters, is awkward. I would call the
source something like searchFor, and the target I would call searchIn. Note, you have confused these values so much that your description does not match your code. In your description you have: E.g Source String - "Hello World"
Target String - "ooood"
Output = 3but in your code you have:
String target = "Hello World";
String source = "oood";
int count = countChars(target, source);Note how you have swapped the source/target between the example and the code.
source and target are basically bad names to have...-
Your code uses a number of small variables that are unconventional.
loc is not a terrible name, it is clearly short for location, but using the fill location would be better. I have a habit of using pos so I can't really complain. On the other hand, the variables a for the char array, and z for an index, are bad names. You already use charRepeat, so you know how to use meaningful names, use them. The z is interesting because it is common to use x, y, and z as names for coordinate space, so seeing z automatically implies there's an x and y too. You should rather just use the standard index variables i, j, and k (but don't use j unless you are already using i, and don't use k unless you are using j too).Algorithm
Your algorithm is a bit messed up too. Basically, you scan the
searchFor chars, and find the first occurrence of each char value. If it's the first one, you then use that to scan the searchIn value, and you count the 'hits' in there. There are two issues here. The first issue is the performance of the 'first check'. The performance is \$O(n^2)\$ where \$n\$ is the number of chars in the searchFor value.Then, for each unique char, you then do a loop through all the
searchIn chars. The loop through all those are disguised in two steps, first the while loop, and then inside that, the indexOf search.The net result is that you have a performance of \$O(nm)\$ or \$O(n^2)\$ depending on whether there are more characters in the
searchIn or searchFor strings.There is a better solution for both the uniqueness testing, and for the searching. Consider sorting the chars in each string:
char[] lookFor = searchFor.toCharArray();
char[] lookIn = searchIn.toCharArray();
Arrays.sort(lookFor);
Arrays.sort(lookIn);The two sorts are \$O(n \log n)\$ performance, so this is, so far, significantly less complicated... now, how do we compare? Well, with a 'simple' loop over all the lookIn characters:
int forpos = 0;
int count = 0;
for (char c : lookIn) {
// advance the searchFor cursor to the next usable position
while (forpos = lookFor.length) {
//nothing left to look for.
break;
}
if (c == lookFor[forpos]) {
count++;
}
}
return count;With the above code, you do two sorts, and 1 loop through each value. The net result is that your performance complexity is simply \$O(n \log n)\$ where \$n\$ is the longer of the
searchIn or searchFor variables.By sorting each side, you give yourself a significant algorithmic advantage.
Code Snippets
public static int countChars(String target, String source){
char a[] = source.toCharArray();
int loc = 0, count = 0;
boolean charRepeat = false;
for (loc = 0; loc < source.length(); loc++) {
charRepeat = false;
for (int z = 0; z < loc; z++) {
if (a[loc] == a[z]) {
charRepeat = true;
break;
}
}
if (charRepeat == false) {
int i = 0;
while ((i = target.indexOf(a[loc], i)) != -1) {
count++;
i++;
}
}
}
return count;
}E.g Source String - "Hello World"
Target String - "ooood"
Output = 3String target = "Hello World";
String source = "oood";
int count = countChars(target, source);char[] lookFor = searchFor.toCharArray();
char[] lookIn = searchIn.toCharArray();
Arrays.sort(lookFor);
Arrays.sort(lookIn);int forpos = 0;
int count = 0;
for (char c : lookIn) {
// advance the searchFor cursor to the next usable position
while (forpos < lookFor.length && lookFor[forpos] < c) {
forpos++;
}
if (forpos >= lookFor.length) {
//nothing left to look for.
break;
}
if (c == lookFor[forpos]) {
count++;
}
}
return count;Context
StackExchange Code Review Q#68746, answer score: 7
Revisions (0)
No revisions yet.