HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Count unique characters in a String and group the counts

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
uniquethecountsgroupandcharacterscountstring

Problem

The code counts the amount of unique characters that fit in the character set ACGT (stemming from nucleobases).

We take a string as input that must contain only either 'A', 'C', 'G', or 'T'.

For instance, "AACCC" should return the Map that looks like this:

Map('A' -> 2, 'C' -> 3, 'G' -> 0, 'T' -> 0).

More detail can be found in a gist with specs.

class DNA(code: String) {
  val nucleotides = Map('A' -> 0, 'T' -> 0, 'C' -> 0, 'G' -> 0)

  code foreach valid_nucleotide

  def nucleotideCounts(): Map[Char, Int] = {
    nucleotides ++ code.filter(nucleotides.contains(_)).groupBy(identity).mapValues(_.length)
  }

  def count(nucleotide: Char): Int = {
    valid_nucleotide(nucleotide)
    nucleotideCounts()(nucleotide)
  }

  private def valid_nucleotide(char: Char) = {
    if (!nucleotides.contains(char)) {
      throw new IllegalArgumentException("DNA is wrong");
    }
  }

}

Solution

For the most part, there isn't much that I would change - well done!

First, minor things:

-
Remove the semi-colon in valid_nucleotide.

-
Next, you can change nucleotides.contains(_) (an anonymous function) to just nucleotides.contains (a method value).

-
The reuse of valid_nucleotide in count will result in any call with an invalid parameter getting an IllegalArgumentException("DNA is wrong"). I think this error would be confusing in this context.

The biggest single change I would make is to precompute the nucleotide counts. Currently, the code recomputes them each and every time either count or nucleotideCounts is called. Since code is immutable, it is only necessary to compute them once.

In a Java program, the "empty" nucleotides map could be made into a static member (rather than having a copy of it in every instance of the DNA class). The equivalent approach in Scala is to use a companion object (an object with the same class name as the class it is a companion to).

My version of your code looked like this:

object DNA {
  val nucleotides = Map('A' -> 0, 'T' -> 0, 'C' -> 0, 'G' -> 0)
}

class DNA(code: String) {

  code foreach validNucleotide

  val counts = countNucleotides()

  def nucleotideCounts(): Map[Char, Int] = counts

  def count(nucleotide: Char): Int = {
    validNucleotide(nucleotide)
    nucleotideCounts()(nucleotide)
  }

  private def countNucleotides(): Map[Char, Int] = {
    DNA.nucleotides ++ code.filter(DNA.nucleotides.contains).groupBy(identity).mapValues(_.length)
  }

  private def validNucleotide(char: Char) = {
    if (!DNA.nucleotides.contains(char)) {
      throw new IllegalArgumentException("Character not found in valid nucleotide set")
    }
  }
}

Code Snippets

object DNA {
  val nucleotides = Map('A' -> 0, 'T' -> 0, 'C' -> 0, 'G' -> 0)
}

class DNA(code: String) {

  code foreach validNucleotide

  val counts = countNucleotides()

  def nucleotideCounts(): Map[Char, Int] = counts

  def count(nucleotide: Char): Int = {
    validNucleotide(nucleotide)
    nucleotideCounts()(nucleotide)
  }

  private def countNucleotides(): Map[Char, Int] = {
    DNA.nucleotides ++ code.filter(DNA.nucleotides.contains).groupBy(identity).mapValues(_.length)
  }

  private def validNucleotide(char: Char) = {
    if (!DNA.nucleotides.contains(char)) {
      throw new IllegalArgumentException("Character not found in valid nucleotide set")
    }
  }
}

Context

StackExchange Code Review Q#95425, answer score: 3

Revisions (0)

No revisions yet.