
Hacking Ruby Hash: Fastest #to_struct method


Problem

I am trying to make the fastest #to_struct method in Ruby's Hash.

I am including a use case and benchmark so you can run and see if you have really improved the code.

This is my implementation and the benchmark is included. The time at the bottom is the time it takes on my machine. How can I make this faster?

`require "json"
require 'benchmark'
require 'bigdecimal/math'

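class Hash
  def to_struct
    k = self.keys
    # Struct class name derived from the sorted key names
    klass = k.map(&:to_s).sort_by {|word| word.downcase}.join.capitalize
    begin
      # Reuse the Struct class if it was already defined for this key set
      Kernel.const_get("Struct::" + klass).new(*self.values_at(*k))
    rescue NameError
      # First time this key set is seen: define the Struct class, then instantiate it
      Struct.new(klass, *k).new(*self.values_at(*k))
    end
  end
end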

# You have a hash that you have built in your app
sample_hash = {
  foo_key: "foo_val",
  bar_key: "bar_val",
  baz_key: "baz_val",
  foo1_key: "foo_val",
  bar1_key: "bar_val",
  baz1_key: "baz_val",
  foo2_key: "foo_val",
  bar2_key: "bar_val",
  baz2_key: "baz_val",
  foo3_key: "foo_val",
  bar3_key: "bar_val",
  baz3_key: "baz_val",
  foo4_key: "foo_val",
  bar4_key: "bar_val",
  baz4_key: "baz_val",
  foo5_key: "foo_val",
  bar5_key: "bar_val",
  baz5_key: "baz_val",
  foo6_key: "foo_val",
  bar6_key: "bar_val",
  baz6_key: "baz_val",
  foo7_key: "foo_val",
  bar7_key: "bar_val",
  baz7_key: "baz_val",
}

# Then you have JSON coming from some external api
json_response = "{\"qux_key\":\"qux_val\",\"quux_key\":\"quux_val\",\"corge_key\":\"corge_val\"}"
hash_with_unknown_keys = JSON.parse(json_response)

# Merge these two together
sample_hash.merge!(hash_with_unknown_keys)

iterations = 100_000

Benchmark.bm do |bm|
  bm.report "#to_struct" do
    iterations.times do
      # Would be super nice if I could convert this to a struct with a method
      # Somehow a bit faster than the explicit example below and much faster than open struct
      sample_struct = sample_hash.to_struct
      unless sample_struct.foo_key == "foo_val"
        raise "Wrong value"
      end
    end
  end

  bm.report "Struct" do

Solution

Use OpenStruct and Ruby >= 2.3.0

Starting with MRI 2.3.0, your benchmark using OpenStruct gets fast. Very fast:

ruby-2.2.5: ruby 2.2.5p319 (2016-04-26 revision 54774) [x86_64-linux]
                  user     system      total        real
#to_struct    1.780000   0.000000   1.780000 (  1.774490)
Struct        9.100000   0.000000   9.100000 (  9.099619)
OpenStruct    7.910000   0.000000   7.910000 (  7.911342)

ruby-2.3.0: ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
                  user     system      total        real
#to_struct    1.700000   0.000000   1.700000 (  1.695587)
Struct        7.660000   0.000000   7.660000 (  7.660869)
OpenStruct    0.650000   0.000000   0.650000 (  0.658817)


With a later MRI (2.4.1), your #to_struct method gets a bit of a speed boost as well.

ruby-2.4.1: ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
                  user     system      total        real
#to_struct    1.460000   0.000000   1.460000 (  1.459063)
Struct        7.420000   0.000000   7.420000 (  7.416505)
OpenStruct    0.660000   0.000000   0.660000 (  0.658009)


So if you can, use Ruby >= 2.3.0 and OpenStruct.
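
For reference, here is a minimal sketch of the OpenStruct route. The hash contents and variable names are made up for illustration; they only mirror the benchmark's mix of symbol keys and string keys from parsed JSON.

```ruby
require 'ostruct'

# Hypothetical data: symbol keys built in the app, plus a string key
# of the kind JSON.parse would return.
merged = { foo_key: "foo_val", "qux_key" => "qux_val" }

record = OpenStruct.new(merged)

foo = record.foo_key       # reader for a symbol key
qux = record.qux_key       # string keys are converted to symbols
missing = record.no_such   # unknown attributes return nil instead of raising
```

Note the last line: unlike a Struct instance, which raises NoMethodError for an unknown member, OpenStruct quietly returns nil, so a mistyped attribute name will not fail loudly.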

How to make #to_struct faster

I made the following changes for performance:

  • Sort the key names with plain #sort instead of #sort_by with #downcase.

  • Use #values instead of #values_at (#values returns the values in the same order as #keys). See https://stackoverflow.com/a/31425274/238886

and these for clarity:

  • Eliminate the temporary variable for self.keys.

  • Create the struct instance in one place (DRY).

  • Remove the explicit self references.


With these changes, the code is:

class Hash
  def new_to_struct
    klass_name = keys.map(&:to_s).sort.join.capitalize
    klass = begin
              Kernel.const_get("Struct::" + klass_name)
            rescue NameError
              Struct.new(klass_name, *keys)
            end
    klass.new(*values)
  end
end


and the benchmark (run against ruby-2.4.1):

                      user     system      total        real
#to_struct        1.410000   0.000000   1.410000 (  1.403908)
#new_to_struct    0.760000   0.000000   0.760000 (  0.757548)
Struct            7.060000   0.010000   7.070000 (  7.075619)
OpenStruct        0.650000   0.000000   0.650000 (  0.649057)


These changes bring #new_to_struct close to OpenStruct (0.76 s versus 0.65 s for 100,000 iterations), but it is still not as fast.
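
The rewritten method leans on two behaviors that can be checked in isolation: #values lines up with #keys, and a Struct created with a String name is registered under Struct:: so a later const_get finds it. A small sketch, using a made-up two-key hash rather than the benchmark data:

```ruby
# Hypothetical sample data, not taken from the benchmark.
sample = { foo_key: "foo_val", bar_key: "bar_val" }

# Hash preserves insertion order, so #values matches #values_at(*keys).
same_order = sample.values == sample.values_at(*sample.keys)

# Same naming scheme as new_to_struct: sorted key names joined and capitalized.
name = sample.keys.map(&:to_s).sort.join.capitalize  # "Bar_keyfoo_key"

# Passing a String as the first argument defines the constant Struct::<name>.
klass = Struct.new(name, *sample.keys)

# A later lookup returns that same class, which is what the
# begin/rescue in new_to_struct relies on to avoid redefining it.
cached = Kernel.const_get("Struct::" + name).equal?(klass)

record = klass.new(*sample.values)
```

One caveat worth checking against your data: the class name comes from the sorted keys, while the members and values are taken in insertion order. Two hashes with the same keys inserted in a different order would therefore share one Struct class, and the second hash's values would be assigned positionally to the first hash's member order.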

Context

StackExchange Code Review Q#51513, answer score: 2