HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Parsing a string as fast as possible that is using comma delimiters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
fastcommapossibleparsingthatusingstringdelimiters

Problem

I have made a simple static parser for a bigger program. This parser takes a string that has comma deliminators and I need to separate the strings in to a vector for handling in a different section of the code.

I would like this to run as fast as possible so the rest of the program isn't tied up separating strings. I also need this parser to be robust, so I can't set how many columns it should have or anything like that.

For the life of me however, I can't seem to figure out how I can make this any less of a O notation. Right now it is O(N) because of the for loop, if there is something in the string. So worse case, the function is O(N). Is there any way to speed it up? I made this, because I have some special cases where if something has nothing between the commas IE ",," then it still needs to make a spot in the vector so the ",," can be placed in a CSV. Also if we need the parser to ignore a comma between quotes it can do that by setting a bool value.

The main problem, or what I would like to see is, is there a way to reduce the complexity of the function to less than \$O(N)\$?

```
//pass in a string and a vector and if quotes should be used or not and what char should be used for delimation.
void Strtov::parse(vector& inboundVector, const string& stringToBeParsed, bool quotesUsed, const char& charToSepBy)
{
string temporary = "";
//vector retVal;
bool quoteTriggered = false;
//if we get an empty string do nothing.
if (stringToBeParsed.size() > 0) {
//if the string has something in it then go ahead down the list.
for (size_t i = 0; i < stringToBeParsed.size(); i++) {

if (stringToBeParsed[i] == '\"' && quotesUsed) {
//if they are true, then mark them false, if they are false then mark them true
quoteTriggered = quoteTriggered ? false : true;
if (quoteTriggered) {
temporary.push_back(stringToBeParsed[i]);
}
e

Solution

-
There is no way to go under \$O(N)\$, just because you need to inspect the whole string.

-
The test if (stringToBeParsed.size() > 0) does not speed up the execution; you may safely omit it.

-
quoteTriggered = quoteTriggered ? false : true; is a long way to say quoteTriggered = !quoteTriggered.

-
In

if (quoteTriggered) {
        temporary.push_back(stringToBeParsed[i]);
    }   
    else {
        //place quote in to the temporary string and store it.
        temporary.push_back(stringToBeParsed[i]);
    }


the character is pushed anyway, so testing for quoteTriggered is pointless.

-
The remaining flow in the loop can be streamlined as:

if quoteIsTriggered or char_is_not_separator
        push_to_temporary
    else
        substring is parsed


-
Upon the loop termination, you don't need to inspect the last character - if it was comma, temporary will be conveniently empty. Just push temporary and be done.

Putting it all together:

void parse(vector& inboundVector, const string& stringToBeParsed, bool quotesUsed, const char& charToSepBy)
    {
        string temporary = "";
        bool quoteTriggered = false;
        for (size_t i = 0; i < stringToBeParsed.size(); i++) {
            if (stringToBeParsed[i] == '\"' && quotesUsed) {
                quoteTriggered = !quoteTriggered;
                temporary.push_back(stringToBeParsed[i]);
                continue;
            }

            if (quoteTriggered || stringToBeParsed[i] != charToSepBy) {
                temporary.push_back(stringToBeParsed[i]);
            } else {
                inboundVector.push_back(temporary);
                temporary = "";
            }
        }

        inboundVector.push_back(temporary);
    }

Code Snippets

if (quoteTriggered) {
        temporary.push_back(stringToBeParsed[i]);
    }   
    else {
        //place quote in to the temporary string and store it.
        temporary.push_back(stringToBeParsed[i]);
    }
if quoteIsTriggered or char_is_not_separator
        push_to_temporary
    else
        substring is parsed
void parse(vector<string>& inboundVector, const string& stringToBeParsed, bool quotesUsed, const char& charToSepBy)
    {
        string temporary = "";
        bool quoteTriggered = false;
        for (size_t i = 0; i < stringToBeParsed.size(); i++) {
            if (stringToBeParsed[i] == '\"' && quotesUsed) {
                quoteTriggered = !quoteTriggered;
                temporary.push_back(stringToBeParsed[i]);
                continue;
            }

            if (quoteTriggered || stringToBeParsed[i] != charToSepBy) {
                temporary.push_back(stringToBeParsed[i]);
            } else {
                inboundVector.push_back(temporary);
                temporary = "";
            }
        }

        inboundVector.push_back(temporary);
    }

Context

StackExchange Code Review Q#124506, answer score: 6

Revisions (0)

No revisions yet.