patterncppMinor
C++ and STL - Machine Learning Problem
Viewed 0 times
problemstllearningmachineand
Problem
I would like to get some general comments on style and use of STL in particular. This is some code I wrote to do machine learning classification (logistic regression). Any suggestions would be very appreciated!
A record containing an individual training example:
A collection of
A record containing an individual training example:
/*
* A class representing a single record to be used for training or evaluation
* of a classifier. This record contains a vector of predictors which are
* observations used to predict an outcome, together with a target outcome for
* that observation. The machine learning algorithm's job is to learn to
* generalize this predictor to target outcome relationship.
*/
#include "Record.h"
using namespace std;
/**
* Construct a record from the provided predictor Vector and target outcome.
*/
Record::Record(const Vector& predictor, double target) :
predictor_(predictor), target_(target) {
}
/**
* Construct a record by parsing the provided string. The string must contain
* a set of space seperated values, with the target outcome in column kTargetCol
* and the predictor vector starting on column kPredictorCol. Note that each
* element of the predictor vector is preceeded by an index and a colon. So
* the string looks has the following form:
* 1: 2: ...
*/
Record::Record(const string& line) {
Init(line);
}
void Record::Init(const string& line) {
stringstream line_stream(line);
string token;
for (int i = 0; getline(line_stream, token, ' '); ++i) {
if (i == kTargetCol) {
istringstream istream(token);
double target;
istream >> target;
target_ = target;
}
if (i >= kPredictorCol) {
istringstream istream(token.substr(token.find(':') + 1));
double element;
istream >> element;
predictor_.push_back(element);
}
}
}
/** Convert a record into a string suitable for display */
string Record::String() const {
ostringstream sstream;
sstream << target_;
return sstream.str() + ", " + predictor_.String();
}A collection of
Solution
Edit (based on the large number of additions) I am re-writing this:
General Notes:
You should add header guards to all your *.h files.
Vector.h
Don't like your use of string conversion methods. Very Java like and not C++ like at all. Prefer to write stream operators which are much more versatile.
Hide your implementation details by using local typedefs. Also it is more traditional to just call the method to get a const iterator
This works but it also allows some expressions that don't do much;
Not totally convinced that setting up these operators will make using the class easier to read.
Surprised this compiles at all here:
The header file should include all the header files it need to make the required types available. There is no
There are a couple of explanations to both these problems none of them good.
NEVER put
NEVER make a header file depend on being included in the correct order.
Vector.cpp
I would use the correct type. here.
As described above I would return by const reference.
Here I would not convert to a string.
I would a stream operator (then use some algorithms inside to make it cleaner).
I will add more when I get some time.
Old review.
Rather than write a to_string() method:
Write a stream operator.
If you really just want a string use lexical cast:
I don't see the need for a file-stream member that is kept open for the life of the object. Open it use it then let it fall out of scope:
Here you are exposing an implementation detail:
The user of your class should not need to know that you are using a vector internal to your class
``
General Notes:
You should add header guards to all your *.h files.
Vector.h
Don't like your use of string conversion methods. Very Java like and not C++ like at all. Prefer to write stream operators which are much more versatile.
string String() const;
// Prefer
friend std::ostream& operator<<(std::ostream stream, Vector const& data)
{
return stream;
}Hide your implementation details by using local typedefs. Also it is more traditional to just call the method to get a const iterator
begin() rather than const_begin() etc.vector::const_iterator const_begin() const;
vector::iterator begin();
// Prefer
typedef std::vector Container;
typedef Container::iterator iterator;
typedef Container::const_iterator const_iterator;
iterator begin();
const_iterator begin() const;This works but it also allows some expressions that don't do much;
double operator[](int index) const;
// Now the compiler will allow:
Vector x;
x[5] = 6; // Will compile but not do anything.
// I would prefer this fails to compile so that the
// user of the class will know that they have done
// something wrong immediately.
// Thus I would prefer this declaration:
double const& operator[](std::size_t index) const;Not totally convinced that setting up these operators will make using the class easier to read.
Vector operator+(const Vector& operand) const;
Vector operator-(double scalar) const;
Vector operator-(const Vector& operand) const;
Vector operator*(double scalar) const;
Vector operator/(double scalar) const;
Vector operator/(const Vector& operand) const;Surprised this compiles at all here:
private:
vector elements;The header file should include all the header files it need to make the required types available. There is no
#include in this file. Also you do not prefix vector with std:: so how does it know where it is defined.There are a couple of explanations to both these problems none of them good.
NEVER put
using namespace std; in a header file.NEVER make a header file depend on being included in the correct order.
Vector.cpp
Vector::Vector() : elements(vector()) {}
// Easier to write:
Vector::Vector() : elements() {}
Vector::Vector(int size) : elements(vector(size, 0)) {}
// Easier to write
Vector::Vector(int size) : elements(size) {} // Value defaults to 0I would use the correct type. here.
int Vector::size() const {return elements.size();}
// I would use:
std::size_t Vector::size() const {return elements.size();}As described above I would return by const reference.
double Vector::operator[](int index) const {
return elements[index];
}Here I would not convert to a string.
I would a stream operator (then use some algorithms inside to make it cleaner).
/** Convert a Vector into a string suitable for display. */
string Vector::String() const {
friend std::ostream& operator(stream," "));
stream << "] ";
return stream;
}I will add more when I get some time.
Old review.
Rather than write a to_string() method:
string Record::String() const {
ostringstream sstream;
sstream << target_;
return sstream.str() + ", " + predictor_.String();
}Write a stream operator.
std::ostream& operator<<(std::stream& stream, Record const& data)
{
return stream << data.target_ << ", " << data.predictor_;
}If you really just want a string use lexical cast:
Record record(/* Initialization*/);
// lexical_cast use the stream operator.
std::string data = boost::lexical_cast(record);I don't see the need for a file-stream member that is kept open for the life of the object. Open it use it then let it fall out of scope:
RecordList::RecordList(const string& filename) {
input_file_.open(filename.c_str(), ios::in);
Init();
}
void RecordList::Init() {
for (string line; getline(input_file_, line);) {
Record record(line);
records_.push_back(record);
}
}
RecordList::~RecordList() {
input_file_.close();
}
I would just do:
RecordList::RecordList(const string& filename) {
Init(filename);
}
void RecordList::Init(std::string const& filename) {
std::ifstream input_file(filename.c_str());
for (string line; getline(input_file, line);) {
Record record(line);
records_.push_back(record);
}
}
RecordList::~RecordList() {}Here you are exposing an implementation detail:
vector::const_iterator RecordList::const_begin() const {The user of your class should not need to know that you are using a vector internal to your class
``
class RecordList
{
public:
typedef std::vector Container;
typedef Container::iterator iterator;
typedef Container::const_iterator const_iterator;
const_iterator RecordList::const_begin() const;
};
Code Snippets
string String() const;
// Prefer
friend std::ostream& operator<<(std::ostream stream, Vector const& data)
{
return stream;
}vector<double>::const_iterator const_begin() const;
vector<double>::iterator begin();
// Prefer
typedef std::vector<double> Container;
typedef Container::iterator iterator;
typedef Container::const_iterator const_iterator;
iterator begin();
const_iterator begin() const;double operator[](int index) const;
// Now the compiler will allow:
Vector x;
x[5] = 6; // Will compile but not do anything.
// I would prefer this fails to compile so that the
// user of the class will know that they have done
// something wrong immediately.
// Thus I would prefer this declaration:
double const& operator[](std::size_t index) const;Vector operator+(const Vector& operand) const;
Vector operator-(double scalar) const;
Vector operator-(const Vector& operand) const;
Vector operator*(double scalar) const;
Vector operator/(double scalar) const;
Vector operator/(const Vector& operand) const;private:
vector<double> elements;Context
StackExchange Code Review Q#18002, answer score: 6
Revisions (0)
No revisions yet.