HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Class to read comma separated data from disk

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
classcommadiskreadseparatedfromdata

Problem

After programming a lot in high level languages such as Python and R, I started working with C++ recently. To get my C++ skills up, I of course read a lot of books. In addition, I try and replicate some functionality from high level languages in C++. My first attempt at this (after Hello World :)) was to create a C++ class which could read comma separated files. In essence the class parses the file and reads it into a boost::MultiArray. I would appreciate any feedback on the code.

The links to the code and example data are listed below, they link to my bitbucket account.

The actual code comprises of a cpp file (csv-reader.cpp):

```
#include
#include
#include
#include
#include
#include
#include
#include "csv-test.h"
#include

template class csv_reader {
boost::multi_array content2d ;
std::vector split_line ;
std::string line;
std::string sep ;
int ncol ;
int nrow ;
public :
csv_reader(std::string, std::string) ; // constructor
~csv_reader(); // desctructor
void cout_content() ; // print the contents
T operator() (unsigned row, unsigned column) ;
} ;

// Constructor
template csv_reader::csv_reader(std::string path, std::string sep = ",")
{
// Initializing variables
ncol = 0 ; // Initialize the number of colums to 0
nrow = 1 ; // Initialize the number of rows to 1
content2d = boost::multi_array (boost::extents[0][0]) ;
std::ifstream data(path.c_str()) ;

// read the csv data
while(getline(data, line))
{
boost::split(split_line, line, boost::is_any_of(sep) ) ;
if(ncol == 0)
{
ncol = split_line.size() ;
}
else assert(ncol == split_line.size()) ;
content2d.resize(boost::extents[nrow][ncol]) ;
for(int i = 0; i csv_reader::~csv_reader() { }

template void csv_reader::cout_content()
{
for(int row = 0; row T csv_reader::operator() (unsigned row, unsigned column)
{
if (row >= nrow || column >= ncol)
throw BadIndex("boost::Multi

Solution

First of all, boost::split() is just a strtok() so it won't work with REAL csv files. CSV parsing is huge topic and it is better avoid this discussion here.

Second, convertToDouble() is not from C++, seems to be a visitor from C#. Even if I am wrong, it is design problem: template is defined for arbitrary class, but "silently" uses double for operation. It is necessary to create proper type casting, or use template specialization.

Strange usage of assert(). It does something in DEBUG builds only, in release build it is empty. If you need debug aids, just put assert() before content2d.resize(boost::extents[nrow][ncol]) ; and add any if that you need to handle the size problem for release builds. Consider throw an exception.

Destructor. You may drop both declaration and implementation of dtor and let the compiler create one for you, but I have a concern.

I do not know, where boost::multi_array allocates its memory, but if it uses stack, you may have troubles with big arrays. Because of this, it may be better to allocate content2d in heap (content2d = new boost::multi_array; and put delete content2d into destructor.

Constructor. It is better to use initializators list, so it may looks like (I agree with Loki about streams):

csv_reader(std::istream& str, std::string const& sep) : content2d( new boost::multi_array (boost::extents[0][0]))


I assume boost::multi_array* content2d declaration.

You do not need ncol and nrow variables, besides local variables in ctor: array contains information about its dimensions, and if you add separate variables for that, you just add another point of failure in your code.

Code Snippets

csv_reader(std::istream& str, std::string const& sep) : content2d( new boost::multi_array<T, 2> (boost::extents[0][0]))

Context

StackExchange Code Review Q#10128, answer score: 5

Revisions (0)

No revisions yet.