patternjavaMinor
Nucleotide Count
Viewed 0 times
countnucleotidestackoverflow
Problem
I am learning Java and hence practicing it from here. The solution seems pretty trivial to me but still I would like to have a honest review which would guide me to following ideas:
Test suites:
```
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.entry;
import org.junit.Test;
public class NucleotideTest {
@Test
public void testEmptyDnaStringHasNoAdenosine() {
DNA dna = new DNA("");
assertThat(dna.count('A')).isEqualTo(0);
}
@Test
public void testEmptyDnaStringHasNoNucleotides() {
DNA dna = new DNA("");
assertThat(dna.nucleotideCounts()).hasSize(4).contains(
entry('A'
- Am I maintaining class Invariants?
- Do I need to have more Domain specific error messages?
- Although the solution seems trivial but I am missing some performance boost?
- Am I following proper encapsulation?
- Using correct data structures?
- Any other suggestion are most welcome.
import java.util.Map;
import java.util.HashMap;
import java.util.Collections;
public class DNA {
private final Map nucleotideMap = new HashMap<>();
private final String dnaSequence;
public DNA(String dnaSequence) {
this.dnaSequence = dnaSequence;
initializeDefaultNucleotideMap();
if (!dnaSequence.isEmpty()) {
countNucleotides();
}
}
private Map initializeDefaultNucleotideMap() {
nucleotideMap.put('A', 0);
nucleotideMap.put('C', 0);
nucleotideMap.put('G', 0);
nucleotideMap.put('T', 0);
return nucleotideMap;
}
private void countNucleotides() {
for (Character chr : dnaSequence.toCharArray()) {
nucleotideMap.put(chr, nucleotideMap.get(chr) + 1);
}
}
public int count(char nucleotide) {
if (!nucleotideMap.containsKey(nucleotide)) {
throw new IllegalArgumentException("Invalid nucleotide");
}
return nucleotideMap.get(nucleotide);
}
public Map nucleotideCounts() {
return Collections.unmodifiableMap(nucleotideMap);
}
}Test suites:
```
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.entry;
import org.junit.Test;
public class NucleotideTest {
@Test
public void testEmptyDnaStringHasNoAdenosine() {
DNA dna = new DNA("");
assertThat(dna.count('A')).isEqualTo(0);
}
@Test
public void testEmptyDnaStringHasNoNucleotides() {
DNA dna = new DNA("");
assertThat(dna.nucleotideCounts()).hasSize(4).contains(
entry('A'
Solution
Am I maintaining class Invariants?
Yes. But I wonder if you really need to expose
Do I need to have more Domain specific error messages?
I don't see an obvious reason to do that. The
Although the solution seems trivial but I am missing some performance boost?
Since the number of nucleotides is unlikely to change, using a
Am I following proper encapsulation?
As mentioned earlier, it would be better to hide the implementation detail of the storage of the nucleotide counts, that is, remove the
Using correct data structures?
Yes.
Any other suggestion are most welcome.
Some additional tips and observations:
-
No need to keep
-
In the constructor, it's unnecessary to check if
-
You'll get an NPE on invalid nucleotides, for example
Yes. But I wonder if you really need to expose
nucleotideCounts. You did it safely, by wrapping the map in unmodifiableMap (and because the values are immutable too). To eliminate the risk of abuses and errors, it would be better to keep this implementation detail hidden.Do I need to have more Domain specific error messages?
I don't see an obvious reason to do that. The
IllegalArgumentException seems appropriate when trying to get an invalid nucleotide.Although the solution seems trivial but I am missing some performance boost?
Since the number of nucleotides is unlikely to change, using a
Map for the storage seems a bit overkill. A simple array of 4 elements would be more light-weight. That being said, premature optimization is considered evil, so I wouldn't worry about this too much until the current implementation is proven to be a bottleneck.Am I following proper encapsulation?
As mentioned earlier, it would be better to hide the implementation detail of the storage of the nucleotide counts, that is, remove the
nucleotideCounts method. But it depends on your use case. If you really need that method, then you cannot remove it. In any case, it's important to ask that question: should this be hidden?Using correct data structures?
Yes.
Any other suggestion are most welcome.
Some additional tips and observations:
-
No need to keep
dnaSequence in a field. Once you built the map of counts from it, you no longer need it. I suggest to get rid of it.-
In the constructor, it's unnecessary to check if
dnaSequence is empty. The iteration logic in countNucleotides naturally embeds that check.-
You'll get an NPE on invalid nucleotides, for example
new DNA("hello")Context
StackExchange Code Review Q#126583, answer score: 5
Revisions (0)
No revisions yet.