HiveBrain v1.2.0
Get Started
← Back to all entries
patternrubyMinor

CSV module in Ruby

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
csvmoduleruby

Problem

I finished the Ruby chapter in Seven Languages in Seven Weeks. It tries to make you familiar with the core concepts of several languages rather quickly. Dutifully I did all exercises, but most likely they can be improved to be more Ruby-like.


Given: a CSV file structured with a first line with headers and
subsequent rows with data.

one, two
lions, tigers




Create a module which loads the header and values from a CSV file,
based on the name of the implementing class. (RubyCSV ->
"rubycsv.txt") Support an each method which returns a CsvRow
object. Use method_missing to return the value for the column for a
given heading. E.g. usage which will print "lions":

m = RubyCsv.new
m.each { |row| p row.one }


My implementation:

class CsvRow
  attr :row_hash

  def initialize( row_hash )
    @row_hash = row_hash
  end

  def method_missing( name, *args )
    @row_hash[ name.to_s ]
  end
end

module ActsAsCsv
  attr_accessor :headers, :csv_contents

  def self.included( base )
    base.extend ClassMethods
  end

  module ClassMethods
    def acts_as_csv
      include InstanceMethods
    end
  end

  module InstanceMethods
    def read
      @csv_contents = []
      filename = self.class.to_s.downcase + '.txt'
      file = File.new( filename )
      @headers = file.gets.chomp.split( ', ' )

      file.each do |row|
        @csv_contents << row.chomp.split( ', ' )
      end
    end               

    def initialize
      read
    end

    def each      
      @csv_contents.each do |content|        
        hash = {}
        @headers.zip( content ).each { |i| hash[ i[0] ] = i[1] }
        yield CsvRow.new hash
      end
    end
  end
end

class RubyCsv  # No inheritance! You can mix it in.
  include ActsAsCsv
  acts_as_csv
end

Solution

def method_missing( name, *args )
  @row_hash[ name.to_s ]
end


Doing it like this means that if the user calls a row method with arguments, the arguments will silently be ignored. Also if the user calls any method which does not exist and for which no row exists either, he'll just get back nil. I think in both cases an exception should be thrown, so I'd implement method_missing like this:

def method_missing( name, *args )
  if @row_hash.has_key?(name.to_s)
    if args.empty?
      @row_hash[ name.to_s ]
    else
      raise ArgumentError, "wrong number of arguments(#{ args.size } for 0)"
    end
  else
    super
  end
end


module ActsAsCsv
  # ...

  def self.included( base )
    base.extend ClassMethods
  end

  module ClassMethods
    def acts_as_csv
      include InstanceMethods
    end
  end

  module InstanceMethods
    ...
  end
end


This setup seems needlessly complicated. Since all acts_as_csv does is include the instance methods (except headers and csv_contents, which will be present even if acts_as_csv is not called - which seems a bit arbitrary to me) and there is no reason why a user would want to include ActsAsCsv without getting the instance methods, I see no reason for acts_as_csv to exist at all. The instance methods should be directly in the ActsAsCsv module and the ClassMethods and InstanceMethods modules should not exist.

This way the code will be less complex, and you only need one line include ActsAsCsv instead of two to enable the CSV functionality.

def read
  @csv_contents = []
  filename = self.class.to_s.downcase + '.txt'
  file = File.new( filename )
  @headers = file.gets.chomp.split( ', ' )

  file.each do |row|
    @csv_contents << row.chomp.split( ', ' )
  end
end


First of all you're opening a file and never closing it. You should use File.open with a block instead.

Second of all you should make use of higher order functions like map. Using map you can create @csv_contents like this instead of appending to it in an each loop:

def read
  filename = self.class.to_s.downcase + '.txt'
  file = File.open( filename ) do |file|
    @headers = file.gets.chomp.split( ', ' )
    @csv_contents = file.map {|row| row.chomp.split( ', ' )}
  end
end


That being said I don't think it's a good idea to read the whole file into memory in advance (or at all), as that will make your library unusable on big files (which might not even fit into memory).

So I would get rid of the read method and just open and read through the file in the each method, like this:

def each
  filename = self.class.to_s.downcase + '.txt'
  File.open(filename) do |file|
    headers = file.gets.chomp.split( ', ' )
    file.each do |content|        
      hash = {}
      headers.zip( content.chomp.split(', ') ).each { |i| hash[ i[0] ] = i[1] }
      yield CsvRow.new hash
    end
  end
end


Lastly, instead of

hash = {}
headers.zip( content ).each { |i| hash[ i[0] ] = i[1] }


You can also write hash = Hash[ headers.zip( content ) ].

On a more general note, your code does not really correctly parse CSV files. Your code assumes that the fields will be separated by a comma followed by a single space. However it is not required that commas are actually followed by any spaces (and the RFC actually says that any space after a comma is part of the field's content and should not be ignored). You're also not handling quoting (e.g. foo, "bar, baz", bay, which is a row containing three, not four, fields) at all.

Code Snippets

def method_missing( name, *args )
  @row_hash[ name.to_s ]
end
def method_missing( name, *args )
  if @row_hash.has_key?(name.to_s)
    if args.empty?
      @row_hash[ name.to_s ]
    else
      raise ArgumentError, "wrong number of arguments(#{ args.size } for 0)"
    end
  else
    super
  end
end
module ActsAsCsv
  # ...

  def self.included( base )
    base.extend ClassMethods
  end

  module ClassMethods
    def acts_as_csv
      include InstanceMethods
    end
  end

  module InstanceMethods
    ...
  end
end
def read
  @csv_contents = []
  filename = self.class.to_s.downcase + '.txt'
  file = File.new( filename )
  @headers = file.gets.chomp.split( ', ' )

  file.each do |row|
    @csv_contents << row.chomp.split( ', ' )
  end
end
def read
  filename = self.class.to_s.downcase + '.txt'
  file = File.open( filename ) do |file|
    @headers = file.gets.chomp.split( ', ' )
    @csv_contents = file.map {|row| row.chomp.split( ', ' )}
  end
end

Context

StackExchange Code Review Q#2144, answer score: 4

Revisions (0)

No revisions yet.