HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Shady Characters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
charactersshadystackoverflow

Problem

As part of my familiarization with the features of Java 8, and inspired by this question, I thought I would take the 'Shady Character' problem to 'the next level'.

The basic problem is that you have a 'source' string, and a target string. The objective is to count the number of characters in the target that also appear in the source.

I decided to clarify the problem in 1 way, and extend it in another:

  • If the source contains duplicate characters, they should only count once in the target.



  • The system should be able to count characters concurrently from multiple targets, and, additionally, should fully utilize as much processing capacity as possible, hopefully very efficiently.



Given that the objective was to use Java8 streams too, I specifically targeted the IntStream as an efficient way to handle the characters.

```
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

public class ShadyCharacters {

private final boolean[] countIt;
private final int offset;
private final AtomicLong foundCount = new AtomicLong();

public ShadyCharacters(final String source) {
int min = Character.MAX_VALUE;
int max = Character.MIN_VALUE;
char[] chars = source.toCharArray();
for (char c : chars) {
min = Math.min(min, c);
max = Math.max(max, c);
}

if (chars.length == 0) {
offset = 0;
countIt = new boolean[0];
} else {
offset = min;
countIt = new boolean[max - min + 1];
for (char c : chars) {
countIt[c - offset] = true;
}
}
}

public long getCount() {
return foundCount.get();
}

public long countShadows(final String target) {
final long count = IntStream.range(0, target.length()).parallel()
.map(charIndex -> target.charAt(charIndex) - offset)
.filter

Solution

I guess, I wouldn't bother with min as some "small" characters are always present. You're trying to conserve memory, but with source being \u0000\uFFFF you'll need 64 kB, anyway. No big deal and no nice and efficient way to avoid it.

Except for that you could reduce the memory consumption by a factor of 8 without much effort (but I don't tell you how :D:D:D). It'd surely reduce the speed a bit, so feel free top forget it.

private final AtomicLong foundCount = new AtomicLong();


You're aware that calling countShadows the second time return the sum of the first and second run? I found it confusing.

public long countShadows(final String target) {...}
public void countShadows(final Collection targets) {...}


As both methods do about the same, they should both return long (or both void). I'd go for long and keep the class immutable.

I guess you need no AtomicLong at all. The streams are surely able to compute the sum themselves in a thread-safe way.

Code Snippets

private final AtomicLong foundCount = new AtomicLong();
public long countShadows(final String target) {...}
public void countShadows(final Collection<String> targets) {...}

Context

StackExchange Code Review Q#63100, answer score: 4

Revisions (0)

No revisions yet.