Converting std::string to int without Boost

Submitted by: @import:stackexchange-codereview··

Problem

I'm trying to convert a std::string to an int, with a default value if the conversion fails. C's atoi() is way too forgiving, and I've found that boost::lexical_cast is quite slow when the cast fails, presumably because an exception is raised. I don't have C++11, so stoi() is out.

The Delphi function StrToIntDef is the ideal, and it's also available in C++Builder (where I'm working currently). But I want something more portable that works with std::string.

I've created the following function, which uses atoi() but first tests for error conditions. It also allows for leading and trailing spaces, which are harmless in my situation.

int stringtoIntDef(const std::string & sValue, const int & DefaultValue) {
// convert a std::string to integer with a default value returned
// in the case where the string doesn't represent a valid integer 
// - accepts leading or trailing spaces as valid

   bool hasDigits = false;
   bool TrailingSpace = false;
   for (std::string::size_type k = 0; k < sValue.size(); ++k) {
      if ((sValue[k] == ' ') || (sValue[k] == '\t')) {
         TrailingSpace = hasDigits;
      } else if ((sValue[k] == '0') || (sValue[k] == '1') || (sValue[k] == '2') || (sValue[k] == '3') ||
                  (sValue[k] == '4') || (sValue[k] == '5') || (sValue[k] == '6') || (sValue[k] == '7') ||
                  (sValue[k] == '8') || (sValue[k] == '9')) {
         if (TrailingSpace) {
            return DefaultValue;
         } else {
            hasDigits = true;
         }
      } else if ((sValue[k] == '-') && !hasDigits) {
         hasDigits = true; // this protects against "--"
      } else {
         return DefaultValue;
      }
   }
   return atoi(sValue.c_str());
}


In my testing, I've compared it against StrToIntDef and it is just as fast, but lexical_cast is much slower in the case where the default is returned. For 1000 iterations, lexical_cast took 5 seconds while the other 2 weren't measurable.

Solution

A few things jump out at me, both in your implementation and in the standard options you missed. A few (scattered) thoughts follow.

Returning a default if you care about error checking doesn't make sense. How would you determine if conversion failed if there's not an integer available for use as a sentinel? You are basically doing the same thing atoi does except your default is flexible instead of 0.

Have you considered strtol? It's basically atoi with error checking capabilities. It's likely the fastest string-to-int converter you're going to get:

long string_to_long(const std::string& str, long default_value, int base = 10) {
    char* parse_end = NULL; //nullptr if C++11, though really you don't have to initialize this
                            //since strtol is guaranteed to set it. I just like to avoid
                            //uninitialized variables.
    long val = strtol(str.c_str(), &parse_end, base);
    //Succeed only if at least one character was consumed (this rejects "")
    //and parsing stopped exactly at the end of the string.
    if (parse_end != str.c_str() && parse_end == str.c_str() + str.size()) {
        return val;
    } else {
        return default_value;
    }
}


Note: strtol ignores leading whitespace. The function I've written above errors for trailing whitespace (or trailing anything). In other words, " 3" is fine, but "3 " would error.
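
strtol can also report overflow through errno, and its long result may not fit in an int. As a rough sketch of how those checks might be layered on top (the name string_to_int_checked is hypothetical and not part of the answer above; it assumes the same "no trailing characters" rule):

```cpp
#include <cerrno>   // errno, ERANGE
#include <climits>  // INT_MIN, INT_MAX
#include <cstdlib>  // std::strtol
#include <string>

// Hypothetical helper: like string_to_long, but also rejects empty input,
// values that overflow long, and values that do not fit in an int.
int string_to_int_checked(const std::string& str, int default_value, int base = 10) {
    const char* begin = str.c_str();
    char* parse_end = NULL;
    errno = 0;  // strtol only sets errno on failure, so clear it first
    const long val = std::strtol(begin, &parse_end, base);
    if (parse_end == begin) {
        return default_value;  // nothing was consumed ("" or "abc")
    }
    if (parse_end != begin + str.size()) {
        return default_value;  // trailing characters ("3 " or "3x")
    }
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX) {
        return default_value;  // overflowed long, or does not fit in int
    }
    return static_cast<int>(val);
}
```

Leading whitespace is still accepted here because strtol skips it before parsing.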

The idiomatic way to do this in C++ would be to use a string stream:

int string_to_int(const std::string& str, int def)
{
    std::istringstream ss(str);
    int i;
    if (!(ss >> std::noskipws >> i)) {
        //Extracting an int failed
        return def;
    }
    char c;
    if (ss >> c) {
        //There was something after the int
        return def;
    }
    return i;
}


In terms of performance, strtol will likely be faster than this. The stream version does have a useful advantage, though: it can be made highly generic with simple templating (this is the essential idea behind boost::lexical_cast, except that this version doesn't use exceptions):

template<typename ResultType>
ResultType lexical_cast(const std::string& str, const ResultType& default_value = ResultType()) {
    std::istringstream ss(str);
    ResultType result;
    if (!(ss >> std::noskipws >> result)) {
        return default_value;
    }
    char c;
    if (ss >> c) {
        return default_value;
    }
    return result;
}
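
To illustrate the "highly generic" point, here is a small self-contained sketch of the same stream-based idea under a different, hypothetical name (parse_or), so it does not clash with boost::lexical_cast. The value-initialized result is an addition of this sketch, not something the answer's version does:

```cpp
#include <sstream>
#include <string>

// Hypothetical name; same approach as the stream-based template above.
// Works for any type that has an operator>> for std::istream.
template <typename ResultType>
ResultType parse_or(const std::string& str, const ResultType& default_value) {
    std::istringstream ss(str);
    ResultType result = ResultType();  // value-initialize before extraction
    if (!(ss >> std::noskipws >> result)) {
        return default_value;  // extraction failed
    }
    char c;
    if (ss >> c) {
        return default_value;  // something followed the value
    }
    return result;
}
```

The same function then covers, for example, parse_or<int>(text, -1) and parse_or<double>(text, 0.0) without any per-type code. Because of std::noskipws, both leading and trailing whitespace count as failures.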


If you don't want to use exceptions but still want to be able to detect errors, you're going to have to use a flag. You basically have two options: return a boolean flag and pass the value back through a reference parameter, or take a boolean reference parameter and return the value. Which route you take really depends on how you want to use the code, but I would probably return the value. That lets you ignore errors if you want to, without forcing you to. I might do something like this:

template<typename ResultType>
ResultType lexical_cast(const std::string& str, bool& success) {
    std::istringstream ss(str);
    ResultType result = ResultType(); //value-initialize so a failed parse never returns garbage
    success = true;
    if (!(ss >> std::noskipws >> result)) {
        success = false;
    }
    char c;
    if (ss >> c) {
        success = false;
    }
    return result;
}

template<typename ResultType>
ResultType lexical_cast(const std::string& str) {
    bool ignore;
    return lexical_cast<ResultType>(str, ignore);
}


This gives you both options of detecting an error or ignoring one. You could even build more on top of this to have one that returns a default value:

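template<typename ResultType>
ResultType lexical_cast(const std::string& str, const ResultType& default_value = ResultType()) {
    //Note: if the one-argument overload is also declared, a one-argument call
    //becomes ambiguous; drop the default argument here in that case.
    bool success;
    ResultType val(lexical_cast<ResultType>(str, success));
    if (success) {
        return val;
    } else {
        return default_value;
    }
}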
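
An alternative interface returns the flag and writes the value through a reference parameter instead. A minimal sketch, assuming the same stream-based parsing (try_lexical_cast is a hypothetical name, not something from the answer or from Boost):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true on success and stores the parsed value
// in `out`; on failure `out` is left unchanged.
template <typename ResultType>
bool try_lexical_cast(const std::string& str, ResultType& out) {
    std::istringstream ss(str);
    ResultType result = ResultType();
    if (!(ss >> std::noskipws >> result)) {
        return false;  // extraction failed
    }
    char c;
    if (ss >> c) {
        return false;  // something followed the value
    }
    out = result;
    return true;
}
```

This shape reads naturally in a condition, as in if (try_lexical_cast(text, value)) { ... }, at the cost of requiring the caller to declare the variable first.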


Don't check for classes of characters manually. Just use the cctype header's functions (isspace, isdigit, etc).

Your loop is way overcomplicated:

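bool all_digit = true;
for (std::string::size_type i = 0, l = str.size(); i < l; ++i) {
    //cast to unsigned char: passing a negative char to isdigit is undefined
    if (!std::isdigit(static_cast<unsigned char>(str[i]))) {
        all_digit = false;
        break;
    }
}

if (all_digit) { ... }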


Or, allowing leading/trailing whitespace:

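std::string::size_type i = 0;
const std::string::size_type len = str.size();
for (; i < len && std::isspace(static_cast<unsigned char>(str[i])); ++i) { /* empty */ }
const std::string::size_type first_digit = i;
for (; i < len && std::isdigit(static_cast<unsigned char>(str[i])); ++i) { /* empty */ }
const bool has_digits = (i > first_digit); //rejects "" and whitespace-only input
for (; i < len && std::isspace(static_cast<unsigned char>(str[i])); ++i) { /* empty */ }
const bool all_digits = has_digits && (i == len);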
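
Putting the pieces together, the question's function could be sketched roughly as below. This is only one possible arrangement: string_to_int_def is a hypothetical name, a single leading '-' is allowed as in the question, std::isspace accepts more characters than the question's space and tab, and the range check is an addition that the original atoi-based version does not have.

```cpp
#include <cctype>   // std::isspace, std::isdigit
#include <cerrno>   // errno, ERANGE
#include <climits>  // INT_MIN, INT_MAX
#include <cstdlib>  // std::strtol
#include <string>

// Hypothetical rewrite: optional whitespace, optional '-', one or more
// digits, optional whitespace. Anything else yields default_value.
int string_to_int_def(const std::string& str, int default_value) {
    std::string::size_type i = 0;
    const std::string::size_type len = str.size();

    while (i < len && std::isspace(static_cast<unsigned char>(str[i]))) ++i;
    const std::string::size_type number_begin = i;

    if (i < len && str[i] == '-') ++i;  // at most one sign character
    const std::string::size_type digits_begin = i;
    while (i < len && std::isdigit(static_cast<unsigned char>(str[i]))) ++i;
    if (i == digits_begin) {
        return default_value;  // no digits at all ("", "   ", "-")
    }

    while (i < len && std::isspace(static_cast<unsigned char>(str[i]))) ++i;
    if (i != len) {
        return default_value;  // leftover characters ("4 2", "12a", "--1")
    }

    errno = 0;
    const long val = std::strtol(str.c_str() + number_begin, NULL, 10);
    if (errno == ERANGE || val < INT_MIN || val > INT_MAX) {
        return default_value;  // does not fit in an int
    }
    return static_cast<int>(val);
}
```

Validating first keeps the accepted grammar explicit, and strtol is only asked to convert text that is already known to be well formed.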

Context

StackExchange Code Review Q#35388, answer score: 8
