HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Reading in a file and performing string manipulation

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
performingreadingfilemanipulationandstring

Problem

In a question I answered I posted the following code:

#include 
#include 
#include 
#include 

int main()
{
    fstream iFile("names.txt", ios::in);

    // This does not work so replacing with code that does
    // iFile >> file;

    // This is not the best way.
    // But the closest to the intention of the original post.

    // Get size of file.
    ifile.seekg(0,std::ios::end);
    std::streampos          length = ifile.tellg();
    ifile.seekg(0,std::ios::beg);

    // Copy file into string.
    std::string       file(length, '\0');
    ifile.read(&buffer[0],length);

    // Original code continues.

    std::istringstream ss(file);
    std::string token;

    std::vector names;

    while(std::getline(ss, token, ',')) {
        names.push_back(token);
    }

    for (unsigned int i = 0; i < names.size(); i++) {
         auto it = std::remove_if(names[i].begin(), names[i].end(), [&] (char c) { return c == '"'; });
         names[i] = std::string(names[i].begin(), it);
    }

    for (unsigned int i = 0; i < names.size(); i++) {
        std::cout << "names["<<i<<"]: " << names[i] << std::endl;
    }
}


Would this be inefficient for larger files? Would it be better if I read the file into one string, did all the manipulation on that, and then put it into a vector?

Solution

This code looks a little wild. I'd tame it like this:

#include  // for std::remove
#include    // for std::ifstream
#include     // for std::string and std::getline
#include    // for std::move
#include     // for std::vector

std::ifstream infile("thefile.txt");    // we don't really care where you got the name from

std::vector names;

for (std::string line; std::getline(infile, line, ','); )
{
    line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
    names.push_back(std::move(line));
}

// done


A few points to note:

-
Keep it simple. Don't overthink.

-
Use ready-made algorithms, don't reinvent wheels (std::remove removes values).

-
Include what you use (std::string, std::getline).

-
Use algorithms, don't hand-roll your own loops. (erase/remove)

-
Understand what algorithms can do for you (don't hand-write your own "erase"). Get familiar with the standard library.

-
Don't leak scopes. The line string is only needed within the loop, and no further.

-
Efficiency tip: You can move from the line string that you no longer need and save yourself a copy. Also, we process everything in one step; no need to loop repeatedly.

Furthermore, I suggest factoring this logic into a file:

std::vector tokenize_file(std::string const & filename, char delimiter)
{
    std::vector result;

    // ...

    return result;
}


Usage:

std::vector names = tokenize_file("thefile.txt", ',');

Code Snippets

#include <algorithm> // for std::remove
#include <fstream>   // for std::ifstream
#include <string>    // for std::string and std::getline
#include <utility>   // for std::move
#include <vector>    // for std::vector

std::ifstream infile("thefile.txt");    // we don't really care where you got the name from

std::vector<std::string> names;

for (std::string line; std::getline(infile, line, ','); )
{
    line.erase(std::remove(line.begin(), line.end(), '"'), line.end());
    names.push_back(std::move(line));
}

// done
std::vector<std::string> tokenize_file(std::string const & filename, char delimiter)
{
    std::vector<std::string> result;

    // ...

    return result;
}
std::vector<std::string> names = tokenize_file("thefile.txt", ',');

Context

StackExchange Code Review Q#37901, answer score: 5

Revisions (0)

No revisions yet.