HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyMinor

Optimize a zscore algorithm

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
zscorealgorithmoptimize

Problem

I wrote a zscore algorithm in Ruby that runs fine, but is incredibly slow when I have 8000+ entries to score. Can anyone help me figure out a way to improve my code, please?

module Enumerable  
  def mean
    reduce(:+).to_f / length
  end

  def sample_variance
    sum = inject(0){ |acc, i| acc + (i - mean)**2 }
    1 / length.to_f * sum
  end

  def standard_deviation
    Math.sqrt(sample_variance)
  end

  def zscore
    if standard_deviation.zero?
      Array.new(length, 0)
    else
      collect { |v| (v - mean) / standard_deviation } 
    end
  end
end


The float is giving every score an accuracy of up to 17 decimal places. Would making it only 8 decimal places speed things up?

EDIT:
Here is an updated version of the algorithm given the advice received in this thread.

class Array
  def mean(len=self.length)
    reduce(:+).to_f / len
  end

  def sample_variance
    len = length
    m = mean(len)
    sum = reduce { |acc, i| acc + (i - m)**2 }
    sum.to_f / len
  end

  def standard_deviation
    Math.sqrt(sample_variance)
  end

  def zscore
    stdev = standard_deviation
    m = mean
    stdev.zero? ? Array.new(length, 0) : collect { |v| (v - m) / stdev }
  end
end

Solution

The problem is here: collect { |v| (v - mean) / standard_deviation }. standard_deviation is constant but, being inside a block, it is called on each iteration. Set the value to a local variable before. As noted by Flambino, the same principle applies to sample_variance (which uses mean inside a block).

In a functional language (where immutability is honored) the compiler would be able to do the right thing, but not in an imperative language plagued with side-effects like Ruby.

Some additional notes to your code:

-
module Enumerable: But you call .length, which is not a method that an enumerable is required to implement. Consider adding them to Array (which includes Enumerable).

-
reduce and then inject. I'd use just one of the alias.

Context

StackExchange Code Review Q#51087, answer score: 4

Revisions (0)

No revisions yet.