CSV File Parser in Ruby
Problem
The following is my Ruby attempt at a (very) basic CSV file parser class, inspired by an exercise from the book Seven Languages in Seven Weeks. I'm a Ruby novice and will be grateful for any suggestions for improvement.
#!/usr/local/bin/ruby -w
# CSV file parser.
#
# USAGE:
#   CsvFile.new('data.csv').each { |row| puts row.firstname }
#
class CsvFile
  # Parses the given CSV file into a collection of rows.
  #
  def initialize file
    @rows = []
    headers = file.gets.chomp.split(", ")
    file.each do |line|
      values = {}
      headers.zip(line.chomp.split(", ")).each do |key, value|
        values[key] = value
      end
      @rows << CsvRow.new(values)
    end
  end

  # Yields each parsed row in turn.
  #
  def each &block
    @rows.each(&block)
  end

  # Returns the row at the given index.
  #
  def [] index
    @rows[index]
  end
end

class CsvRow
  # values: a hash containing the column -> value mapping
  #
  def initialize values
    @values = values
  end

  # Returns the value in the column given as method name, or null if
  # no such value exists.
  #
  def method_missing name, *args
    @values[name.to_s]
  end
end

# TEST CASES
#
class AssertionError < RuntimeError
end

def assert &block
  raise AssertionError unless yield
end

require 'stringio'

file = StringIO.new(
"firstname, lastname, age, sex
Andrej, Beles, 25, male
Delia, Marin, 20, female
Henry, Prahanth, 33, male"
)
csvFile = CsvFile.new(file)

row = csvFile[0]
assert { row.firstname == "Andrej" }
assert { row.lastname == "Beles" }
assert { row.age == "25" }
assert { row.sex == "male" }

row = csvFile[1]
assert { row.firstname == "Delia" }
assert { row.lastname == "Marin" }
assert { row.age == "20" }
assert { row.sex == "female" }

row = csvFile[2]
assert { row.firstname == "Henry" }
assert { row.lastname == "Prahanth" }
assert { row.age == "33" }
assert { row.sex == "male" }

puts "DONE."
Solution
You've got a bug here.
headers = file.gets.chomp.split(", ")
There is no requirement in a *.csv file that there be a space after the comma between items. In fact, if there is a space, that space should be considered part of the item, not part of the delimiter.
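To make the failure concrete, here is a small sketch comparing the two ways of splitting. The sample lines and variable names are made up for illustration and are not taken from your test data.

```ruby
# Hypothetical sample lines, not taken from the original post.
tight  = "firstname,lastname,age"
spaced = "firstname, lastname, age"

# Splitting on ", " never matches when there is no space after the comma,
# so the whole line comes back as a single field.
tight_on_comma_space = tight.split(", ")

# Splitting on "," alone handles both inputs; any space after a comma
# stays in the field as part of the data.
tight_on_comma  = tight.split(",")
spaced_on_comma = spaced.split(",")

p tight_on_comma_space  # one element: the unsplit line
p tight_on_comma        # three clean fields
p spaced_on_comma       # three fields, the last two with a leading space
```

With the current delimiter, a file written without spaces would produce a single header named after the entire first line, so every column lookup on a row would return nil.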
The next thing I notice is that comma-separated files are not strictly separated by commas. Other delimiters are possible, and you are likely to come across them; pipes (|), for instance, are common. I would consider supporting other delimiters. For example, Excel supports tabs, semicolons, commas, and spaces, along with an option for a custom-defined delimiter. You might not want to fuss with supporting user-defined delimiters, but your class certainly becomes more useful if you support the ones I've mentioned.
The process of implementing this should clean up the string literal duplication you have here.
def initialize file
  @rows = []
  headers = file.gets.chomp.split(", ")
  file.each do |line|
    values = {}
    headers.zip(line.chomp.split(", ")).each do |key, value|
      values[key] = value
    end
    @rows << CsvRow.new(values)
  end
end
But minimally, you should replace ", " with a constant value so you never accidentally change the delimiter in one place but not the other.
Context
StackExchange Code Review Q#70506, answer score: 3