CSV File Parser in Ruby

Submitted by: @import:stackexchange-codereview

Problem

The following is my Ruby attempt at a (very) basic CSV file parser class, inspired by an exercise from the book Seven Languages in Seven Weeks. I'm a Ruby novice and will be grateful for any suggestions for improvement.

#!/usr/local/bin/ruby -w

# CSV file parser.
#
# USAGE:
# CsvFile.new(File.open('data.csv')).each { |row| puts row.firstname }
#
class CsvFile

  # Parses the given CSV file into a collection of rows.
  #
  def initialize file
    @rows = []
    headers = file.gets.chomp.split(", ")
    file.each do |line|
      values = {}
      headers.zip(line.chomp.split(", ")).each do |key, value|
        values[key] = value
      end
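      @rows << CsvRow.new(values)
    end
  end

  # Returns the row at the given index.
  #
  def [] index
    @rows[index]
  end

  # Yields each row in turn.
  #
  def each &block
    @rows.each(&block)
  end
end

# A single row of a CSV file.
#
class CsvRow

  # Creates a row from a hash containing the column -> value mapping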
  #
  def initialize values
    @values = values
  end

  # Returns the value in the column given as method name, or nil if
  # no such value exists.
  #
  def method_missing name, *args  
    @values[name.to_s]
  end
end

# TEST CASES
#

class AssertionError < RuntimeError
end

def assert &block
    raise AssertionError unless yield
end

require 'stringio'

file = StringIO.new(
"firstname, lastname, age, sex
Andrej, Beles, 25, male
Delia, Marin, 20, female
Henry, Prahanth, 33, male"
)

csvFile = CsvFile.new(file)

row = csvFile[0]
assert { row.firstname == "Andrej" }
assert { row.lastname == "Beles" }
assert { row.age == "25" }
assert { row.sex == "male" }

row = csvFile[1]
assert { row.firstname == "Delia" }
assert { row.lastname == "Marin" }
assert { row.age == "20" }
assert { row.sex == "female" }

row = csvFile[2]
assert { row.firstname == "Henry" }
assert { row.lastname == "Prahanth" }
assert { row.age == "33" }
assert { row.sex == "male" }

puts "DONE."

Solution

You've got a bug here.

headers = file.gets.chomp.split(", ")


There is no requirement in a *.csv file that there be a space after the comma between items. In fact, if there is a space, that space should be considered part of the item, not part of the delimiter.
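To make the bug concrete, here is a minimal sketch of how the two split patterns behave. The sample lines are made up for illustration and are not from the original post.

```ruby
# Hypothetical sample lines, with and without a space after each comma.
spaced   = "firstname, lastname, age"
unspaced = "firstname,lastname,age"

# Splitting on ", " only works when every comma is followed by a space.
spaced_fields   = spaced.split(", ")
unspaced_fields = unspaced.split(", ")

# Splitting on "," alone handles both inputs; any space after the comma
# stays in the field, as described above.
comma_spaced   = spaced.split(",")
comma_unspaced = unspaced.split(",")

p spaced_fields    # ["firstname", "lastname", "age"]
p unspaced_fields  # ["firstname,lastname,age"]
p comma_spaced     # ["firstname", " lastname", " age"]
p comma_unspaced   # ["firstname", "lastname", "age"]
```

With the ", " pattern, a file written without spaces comes back as a single column per line.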

The next thing I notice is that comma-separated files are not always separated by commas. Other delimiters are possible, and you are likely to come across them; pipes (|), for example, are common. I would consider supporting other delimiters. Excel, for instance, supports tabs, semicolons, commas, and spaces, along with an option for a custom delimiter. You might not want to fuss with user-defined delimiters, but your class certainly becomes more useful if it supports the ones I've mentioned.
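One possible shape for this, as a rough sketch rather than a finished design: pass the delimiter to the constructor and default it to a comma. The class name DelimitedFile and the delimiter parameter are hypothetical, and rows are plain hashes here to keep the example short.

```ruby
require 'stringio'

# Hypothetical variant of the parser: the delimiter is a constructor
# argument (defaulting to a comma) instead of a hard-coded ", ".
class DelimitedFile
  def initialize(file, delimiter = ",")
    headers = file.gets.chomp.split(delimiter)
    # Each row is a plain hash of header => value, for brevity.
    @rows = file.each_line.map do |line|
      headers.zip(line.chomp.split(delimiter)).to_h
    end
  end

  # Returns the row at the given index.
  def [](index)
    @rows[index]
  end
end

pipes = DelimitedFile.new(StringIO.new("name|age\nDelia|20\n"), "|")
tabs  = DelimitedFile.new(StringIO.new("name\tage\nHenry\t33\n"), "\t")

puts pipes[0]["name"]  # Delia
puts tabs[0]["age"]    # 33
```

One caveat if you go this route: String#split treats a single-space string as a special case and splits on runs of whitespace, so a space delimiter would need separate handling.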

Implementing this should also clean up the duplicated string literal you have here.

  def initialize file
    @rows = []
    headers = file.gets.chomp.split(", ")
    file.each do |line|
      values = {}
      headers.zip(line.chomp.split(", ")).each do |key, value|
        values[key] = value
      end
      @rows << CsvRow.new(values)
    end
  end


At a minimum, you should replace ", " with a constant so you never accidentally change the delimiter in one place but not the other.
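A minimal sketch of the constant approach, assuming a reduced version of your class: the DELIMITER constant, the split_line helper, and the rows reader are names I've introduced for illustration, and rows are plain hashes so the example stands alone.

```ruby
require 'stringio'

# Hypothetical reduced version of the original class.
class CsvFile
  # Single source of truth for the separator. It keeps the original
  # ", " value; changing it here changes it for headers and rows alike.
  DELIMITER = ", "

  attr_reader :rows

  def initialize(file)
    headers = split_line(file.gets)
    @rows = file.each_line.map { |line| headers.zip(split_line(line)).to_h }
  end

  private

  # Both the header line and the data lines go through this one method.
  def split_line(line)
    line.chomp.split(DELIMITER)
  end
end

csv = CsvFile.new(StringIO.new("firstname, age\nAndrej, 25\n"))
puts csv.rows.first["firstname"]  # Andrej
```

Routing both splits through one helper means the literal appears exactly once, so the header and row parsing cannot drift apart.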

Context

StackExchange Code Review Q#70506, answer score: 3
