HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyrailsMinor

Probability ordering by seller's tariff

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
orderingsellerprobabilitytariff

Problem

I have an instance method which calculates the position of an element, depending on the tariff of its seller.

If the seller is of a level3 tariff, it gets to top 100, the probability of 40%. If the tariff is level2, then 30%, etc.

TARIFFS = {level3: 0...40,
           level2: 40...70,
           level1: 70...90,
           level0: 90..100}

def update_priority
  top = 100
  priority = 0
  random = rand(0..top)
  all_count = collection.find.count
  if TARIFFS[seller.tariff] === random
    priority = rand((all_count - top).to_f..all_count.to_f)
  else
    priority = rand(1..all_count)
  end

  collection.find(_id: self._id).update('$set' => {priority: priority})
end


And I have class method which runs every 2 hours:

def recalculate_priorities
  unscoped.each(&:update_priority)
end


It does not work fast. How can I optimize it to work faster?

Solution

Note: I'm assuming collection and unscoped are some sort of Enumerable objects.

The key to improving performance is to not do any unnecessary work. You're doing quite a bit of unnecessary work in #update_priority.

all_count = collection.find.count creates an Enumerator that isn't used. Let's get rid of that.

all_count = collection.find.count calculates the size is of that enumerator inefficiently. #count iterates the collection and counts how many elements it has. It is an O(n) operation. #size on the other hand, doesn't iterate the collection. It is an O(1) operation.

If this only happened once, this wouldn't be so bad, but this happens for every element of unscoped. Suppose unscoped has 100 elements and collection has 1000 elements. We're doing the same calculation a 100 times. That's a 100,000 unnecessary iterations! We can do better: only calculate the size once in #recalculate_priorities.

collection.find(_id: self._id).update('$set' => {priority: priority}) suffers from a similar problem. For each unscoped element at least 1 database query is performed. Again, we can do better: use a single query to update many records at once (a batch query) in #recalculate_priorities.

I've written some example code below. As I don't know any MongoDB, I made up an API for it.

def new_priority(collection_size)
  top      = 100
  priority = 0
  random   = rand(top)

  if TARIFFS[seller.tariff].cover?(random)
    lower = collection_size - top
    upper = collection_size

    return rand(lower..upper) # Floats are unnecessary
  else
    return rand(1..collection_size)
  end
end

def recalculate_priorities
  new_priorities = unscoped.map { |el| el.new_priority(collection.size) }

  unscoped.update(:priority => new_priorities)
end

Code Snippets

def new_priority(collection_size)
  top      = 100
  priority = 0
  random   = rand(top)

  if TARIFFS[seller.tariff].cover?(random)
    lower = collection_size - top
    upper = collection_size

    return rand(lower..upper) # Floats are unnecessary
  else
    return rand(1..collection_size)
  end
end

def recalculate_priorities
  new_priorities = unscoped.map { |el| el.new_priority(collection.size) }

  unscoped.update(:priority => new_priorities)
end

Context

StackExchange Code Review Q#79742, answer score: 2

Revisions (0)

No revisions yet.