patternrubyMinor
Prefix Sum in Ruby, Genomic Range Query from Codility
Viewed 0 times
sumgenomicrangequerycodilityrubyprefixfrom
Problem
I'm currently going through some lessons on Codility. I've just spent a couple of hours with GenomicRangeQuery, which is intended to demonstrate the use of prefix sums.
The task description is here. Broadly, the idea is that given a string and a set of arbitrary cut points one should be able to return the lowest value character in a given slice. O(N*M) complexity is what we want to avoid.
My solution scores 100%, but I would still value any feedback on style and readability.
Updated
Here's my current preferred version using tips from some of the answers below. I should probably also use an array of hashes, rather than arrays, for the prefix sums, but I'm lazy.
The task description is here. Broadly, the idea is that given a string and a set of arbitrary cut points one should be able to return the lowest value character in a given slice. O(N*M) complexity is what we want to avoid.
My solution scores 100%, but I would still value any feedback on style and readability.
def solution(str, slice_starts, slice_ends)
# Initiate an array to hold rolling totals of character occurrences
prefix_sums = [0,0,0,0]]
# Map possible characters to their positions in the prefix sum array
chars = { "A" => 0, "C" => 1, "G" => 2, "T" => 3 }
str.split('').each_with_index do |char, i|
prefix_sums[i + 1] = prefix_sums[i].clone
prefix_sums[i + 1][chars[char]] += 1
end
slice_starts.each_with_index.map do |slice_start, i|
s = slice_start
e = slice_ends[i]
chars.each do |char, ii|
occurrences = prefix_sums[e + 1][ii] - prefix_sums[s][ii]
break (ii + 1) if occurrences > 0
end
end
end
Updated
Here's my current preferred version using tips from some of the answers below. I should probably also use an array of hashes, rather than arrays, for the prefix sums, but I'm lazy.
def solution(str, slice_starts, slice_ends)
prefix_sums = [[0] * 4]
chars = { "A" => 0, "C" => 1, "G" => 2, "T" => 3 }
str.chars.each_with_index do |char, i|
prefix_sums[i + 1] = prefix_sums[i].clone
prefix_sums[i + 1][chars[char]] += 1
end
slice_starts.zip(slice_ends).map do |s, e|
chars.each do |char, ii|
occurrences = prefix_sums[e + 1][ii] - prefix_sums[s][ii]
break (ii + 1) if occurrences > 0
end
end
endSolution
Some notes:
I'd write this purely functional solution:
- Use 2-space indentation.
[[0,0,0,0]]. Always a space after a comma.
str.split('')- >str.chars
prefix_sums = [[0,0,0,0]]. This kind of implicit structures hinder readability, I'd simply use a hash. A small space penalty (not in big-Oh, though), better readibility.
each_with_index. Usezipinstead.
- As you say, the problem is about getting partial sums (frequencies). You can implement and use a very generic abstraction (Enumerable#scanl), which can be used in more codility problems.
I'd write this purely functional solution:
def solution(input_str, slice_start, slice_end)
impact_factors = {"A" => 1, "C" => 2, "G" => 3, "T" => 4}
frequencies = input_str.chars.scanl(Hash.new(0)) do |freq, n|
freq.merge(n => freq[n] + 1)
end
slice_start.zip(slice_end).map do |from, to|
difference_count_by_factor = frequencies[to+1].map do |n, count|
[impact_factors[n], count - frequencies[from][nucleotide]]
end.to_h
difference_count_by_factor.reject { |factor, count| count.zero? }.keys.min
end
endCode Snippets
def solution(input_str, slice_start, slice_end)
impact_factors = {"A" => 1, "C" => 2, "G" => 3, "T" => 4}
frequencies = input_str.chars.scanl(Hash.new(0)) do |freq, n|
freq.merge(n => freq[n] + 1)
end
slice_start.zip(slice_end).map do |from, to|
difference_count_by_factor = frequencies[to+1].map do |n, count|
[impact_factors[n], count - frequencies[from][nucleotide]]
end.to_h
difference_count_by_factor.reject { |factor, count| count.zero? }.keys.min
end
endContext
StackExchange Code Review Q#101206, answer score: 2
Revisions (0)
No revisions yet.