patternrubyMinor

Prefix Sum in Ruby, Genomic Range Query from Codility

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

sumgenomicrangequerycodilityrubyprefixfrom

Problem

I'm currently going through some lessons on Codility. I've just spent a couple of hours with GenomicRangeQuery, which is intended to demonstrate the use of prefix sums.

The task description is here. Broadly, the idea is that given a string and a set of arbitrary cut points one should be able to return the lowest value character in a given slice. O(N*M) complexity is what we want to avoid.

My solution scores 100%, but I would still value any feedback on style and readability.

def solution(str, slice_starts, slice_ends)
    # Initiate an array to hold rolling totals of character occurrences
    prefix_sums = [0,0,0,0]]
    # Map possible characters to their positions in the prefix sum array
    chars = { "A" => 0, "C" => 1, "G" => 2, "T" => 3 }

    str.split('').each_with_index do |char, i|
        prefix_sums[i + 1] = prefix_sums[i].clone
        prefix_sums[i + 1][chars[char]] += 1
    end

    slice_starts.each_with_index.map do |slice_start, i|
        s = slice_start
        e = slice_ends[i]

        chars.each do |char, ii|
            occurrences = prefix_sums[e + 1][ii] - prefix_sums[s][ii]
            break (ii + 1) if occurrences > 0
        end
    end
end

Updated

Here's my current preferred version using tips from some of the answers below. I should probably also use an array of hashes, rather than arrays, for the prefix sums, but I'm lazy.

def solution(str, slice_starts, slice_ends)
  prefix_sums = [[0] * 4]
  chars = { "A" => 0, "C" => 1, "G" => 2, "T" => 3 }

  str.chars.each_with_index do |char, i|
      prefix_sums[i + 1] = prefix_sums[i].clone
      prefix_sums[i + 1][chars[char]] += 1
  end

  slice_starts.zip(slice_ends).map do |s, e|
    chars.each do |char, ii|
      occurrences = prefix_sums[e + 1][ii] - prefix_sums[s][ii]
      break (ii + 1) if occurrences > 0
    end
  end
end

Solution

Some notes:

Use 2-space indentation.

[[0,0,0,0]]. Always a space after a comma.

str.split('') - >str.chars

prefix_sums = [[0,0,0,0]]. This kind of implicit structures hinder readability, I'd simply use a hash. A small space penalty (not in big-Oh, though), better readibility.

each_with_index. Use zip instead.

As you say, the problem is about getting partial sums (frequencies). You can implement and use a very generic abstraction (Enumerable#scanl), which can be used in more codility problems.

I'd write this purely functional solution:

def solution(input_str, slice_start, slice_end)
  impact_factors = {"A" => 1, "C" => 2, "G" => 3, "T" => 4}
  frequencies = input_str.chars.scanl(Hash.new(0)) do |freq, n|
    freq.merge(n => freq[n] + 1)
  end

  slice_start.zip(slice_end).map do |from, to|
    difference_count_by_factor = frequencies[to+1].map do |n, count| 
      [impact_factors[n], count - frequencies[from][nucleotide]]
    end.to_h
    difference_count_by_factor.reject { |factor, count| count.zero? }.keys.min
  end
end

Code Snippets

def solution(input_str, slice_start, slice_end)
  impact_factors = {"A" => 1, "C" => 2, "G" => 3, "T" => 4}
  frequencies = input_str.chars.scanl(Hash.new(0)) do |freq, n|
    freq.merge(n => freq[n] + 1)
  end

  slice_start.zip(slice_end).map do |from, to|
    difference_count_by_factor = frequencies[to+1].map do |n, count| 
      [impact_factors[n], count - frequencies[from][nucleotide]]
    end.to_h
    difference_count_by_factor.reject { |factor, count| count.zero? }.keys.min
  end
end

Context

StackExchange Code Review Q#101206, answer score: 2

Revisions (0)

No revisions yet.