Most efficient way in C++ to strip strings

Submitted by: @import:stackexchange-codereview

Problem

If I want to strip a string completely of its whitespace, punctuation and digits (i.e. anything that is not A-Z or a-z), what is the most efficient way of doing it in C++?

I tried this:

string strip(string in) {
    string final;
    for(int i = 0; i < in.length(); i++) {
        if(isalpha(in[i])) final += in[i];
    }
    return final;
}


It works as expected, but is too slow on strings with ~2000 characters. I figured out that the code causing this slowness is the isalpha() call.

So does anyone know of a better, more efficient way of stripping a string of everything except [A-Z][a-z] in C++?

At most, the string will be 20000 characters long and I need to strip it in <1 second.

Thanks in advance.

EDIT:

If I remove the if condition, the output will display instantly. But with the if condition, it will take about 1.6 seconds to display the output.

For trying out the code, use this: http://pastebin.com/g3NtBFaD and a normal 20k char string. Then try comparing.
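
To compare variants, it can help to time the stripping function on its own, separately from printing the result. The following is only a sketch, assuming a C++11 compiler; time_ms and make_input are hypothetical helper names, and the filler pattern is arbitrary test data rather than the input from the pastebin link.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: builds an n-character string mixing letters,
// a digit, a space and punctuation (2 letters in every 5 characters).
std::string make_input(std::size_t n) {
    const std::string pattern = "ab1 ,";
    std::string s;
    s.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        s += pattern[i % pattern.size()];
    }
    return s;
}

// Hypothetical helper: runs f(input) once, stores the result in output,
// and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F f, const std::string& input, std::string& output) {
    const auto start = std::chrono::steady_clock::now();
    output = f(input);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(strip, make_input(20000), out) measures only the call to strip. If that figure turns out to be far below the 1.6 seconds observed, the delay may lie somewhere other than the isalpha() check, for example in how the output is produced.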

Solution

A few thoughts which come to my mind, without having actually profiled your code:

  • Pass the std::string by reference-to-const to avoid a copy (in case your std::string implementation is not copy-on-write).
  • Reserve space in the result std::string up front by calling reserve.
  • Avoid calling std::string::length repeatedly; store the value once.
  • Avoid indexing the string repeatedly; use an iterator instead.
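
The four suggestions above can be combined in a plain loop. This is a minimal sketch, not a measured improvement; strip_loop is a hypothetical name, and the cast to unsigned char is an addition that the original code does not have (passing a negative char value to isalpha is undefined behavior).

```cpp
#include <cctype>
#include <string>

// Hypothetical name; keeps only the characters for which isalpha is true.
std::string strip_loop(const std::string& in) {  // reference-to-const: no copy
    std::string out;
    out.reserve(in.size());                      // allocate once up front

    // The end iterator is computed once, and no indexing is used.
    for (std::string::const_iterator it = in.begin(), end = in.end();
         it != end; ++it) {
        // Assumption: cast added so that negative char values are safe.
        if (std::isalpha(static_cast<unsigned char>(*it))) {
            out += *it;
        }
    }
    return out;
}
```

Whether any of these changes is noticeable for a 20000-character string depends on the compiler and optimization level, so it is worth measuring before and after.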



For what it's worth, you could also try a different (more functional) way to implement this function. Some may consider it idiomatic; others will find it harder to read. Your call. It may be worth trying just for the fun of it, to see how it performs (remember to enable optimizations!):

#include <algorithm>
#include <cctype>
#include <functional>
#include <iterator>
#include <string>

std::string strip( const std::string &s ) {
    std::string result;
    result.reserve( s.length() );

    std::remove_copy_if( s.begin(),
                         s.end(),
                         std::back_inserter( result ),
                         std::not1( std::ptr_fun( isalpha ) ) );

    return result;
}
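
Note that std::ptr_fun was deprecated in C++11 and removed in C++17, and std::not1 was removed in C++20, so the version above does not compile under the newest standards. A rough equivalent for C++11 and later, offered as a sketch: strip_modern is a hypothetical name, and the lambda taking unsigned char is an addition to keep the isalpha call well defined for negative char values.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical name; same result as strip() above, using std::copy_if
// with a lambda instead of std::not1 and std::ptr_fun.
std::string strip_modern(const std::string& s) {
    std::string result;
    result.reserve(s.size());

    // Copy only the characters for which isalpha reports true.
    std::copy_if(s.begin(), s.end(), std::back_inserter(result),
                 [](unsigned char c) { return std::isalpha(c) != 0; });

    return result;
}
```

std::copy_if keeps the elements that satisfy the predicate, so the negation needed for std::remove_copy_if goes away.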

Context

StackExchange Code Review Q#11203, answer score: 17
