patterncppMinor
Reading in a file and performing string manipulation
Viewed 0 times
performingreadingfilemanipulationandstring
Problem
In a question I answered I posted the following code:
Would this be inefficient for larger files? Would it be better if I read the file into one string, did all the manipulation on that, and then put it into a vector?
#include
#include
#include
#include
int main()
{
fstream iFile("names.txt", ios::in);
// This does not work so replacing with code that does
// iFile >> file;
// This is not the best way.
// But the closest to the intention of the original post.
// Get size of file.
ifile.seekg(0,std::ios::end);
std::streampos length = ifile.tellg();
ifile.seekg(0,std::ios::beg);
// Copy file into string.
std::string file(length, '\0');
ifile.read(&buffer[0],length);
// Original code continues.
std::istringstream ss(file);
std::string token;
std::vector names;
while(std::getline(ss, token, ',')) {
names.push_back(token);
}
for (unsigned int i = 0; i < names.size(); i++) {
auto it = std::remove_if(names[i].begin(), names[i].end(), [&] (char c) { return c == '"'; });
names[i] = std::string(names[i].begin(), it);
}
for (unsigned int i = 0; i < names.size(); i++) {
std::cout << "names["<<i<<"]: " << names[i] << std::endl;
}
}Would this be inefficient for larger files? Would it be better if I read the file into one string, did all the manipulation on that, and then put it into a vector?
Solution
This code looks a little wild. I'd tame it like this:
A few points to note:
-
Keep it simple. Don't overthink.
-
Use ready-made algorithms, don't reinvent wheels (
-
Include what you use (
-
Use algorithms, don't hand-roll your own loops. (erase/remove)
-
Understand what algorithms can do for you (don't hand-write your own "erase"). Get familiar with the standard library.
-
Don't leak scopes. The
-
Efficiency tip: You can move from the
Furthermore, I suggest factoring this logic into a file:
Usage:
#include // for std::remove
#include // for std::ifstream
#include // for std::string and std::getline
#include // for std::move
#include // for std::vector
std::ifstream infile("thefile.txt"); // we don't really care where you got the name from
std::vector names;
for (std::string line; std::getline(infile, line, ','); )
{
line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
names.push_back(std::move(line));
}
// doneA few points to note:
-
Keep it simple. Don't overthink.
-
Use ready-made algorithms, don't reinvent wheels (
std::remove removes values).-
Include what you use (
std::string, std::getline).-
Use algorithms, don't hand-roll your own loops. (erase/remove)
-
Understand what algorithms can do for you (don't hand-write your own "erase"). Get familiar with the standard library.
-
Don't leak scopes. The
line string is only needed within the loop, and no further.-
Efficiency tip: You can move from the
line string that you no longer need and save yourself a copy. Also, we process everything in one step; no need to loop repeatedly.Furthermore, I suggest factoring this logic into a file:
std::vector tokenize_file(std::string const & filename, char delimiter)
{
std::vector result;
// ...
return result;
}Usage:
std::vector names = tokenize_file("thefile.txt", ',');Code Snippets
#include <algorithm> // for std::remove
#include <fstream> // for std::ifstream
#include <string> // for std::string and std::getline
#include <utility> // for std::move
#include <vector> // for std::vector
std::ifstream infile("thefile.txt"); // we don't really care where you got the name from
std::vector<std::string> names;
for (std::string line; std::getline(infile, line, ','); )
{
line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
names.push_back(std::move(line));
}
// donestd::vector<std::string> tokenize_file(std::string const & filename, char delimiter)
{
std::vector<std::string> result;
// ...
return result;
}std::vector<std::string> names = tokenize_file("thefile.txt", ',');Context
StackExchange Code Review Q#37901, answer score: 5
Revisions (0)
No revisions yet.