String algorithms and locale

Submitted by: @import:stackexchange-codereview

Problem

I am trying to write some string algorithms that work for any kind of string and/or locale. I managed to get some results that do work, but I am not sure that what I am doing is idiomatic. Here is a function:

template<typename String>
auto upper(String str,
           const std::ctype<typename String::value_type>& f =
           std::use_facet<std::ctype<typename String::value_type>>(std::locale{}))
    -> String
{
    f.toupper(&str[0], &str[0]+str.size());
    return str;
}


I have problems getting my head around the way C++ works with locales and UTF-8. This algorithm is intended to take a string (well, a range of characters with a standard string interface) and return a copy of it with all the characters converted to uppercase according to a given locale. If no locale (ctype facet) is given, the current global locale (a default-constructed std::locale) is used.

Is there any way to improve this code (taking a locale vs. a facet, for example) so that it would be as useful as possible?

Solution

I think I'd prefer to have the user pass a locale instead of a facet. In most typical cases, a user will deal only with locales, not with the individual facets that make up a particular locale. It's also relatively easy for a user to create a locale on demand, so the code looks something like:

std::string input{"Now is the time for every good man to come to the aid of his country."};

std::string result = upper(input, std::locale("en-us"));


or:

std::string result = upper(input, std::locale(""));


...though I suppose we can probably expect that the default parameter will be used a lot more often than not, in which case all of this is moot.
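Locale names such as "en-us" are platform-specific, and the std::locale constructor throws std::runtime_error when it does not recognize a name. As a minimal sketch of building a locale on demand, the helper below (locale_or_classic is a hypothetical name, not part of the original answer) falls back to the classic "C" locale when the requested one is unavailable:

```cpp
#include <cassert>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try to construct the named locale and fall back
// to the classic "C" locale if the platform does not recognize the name.
inline std::locale locale_or_classic(std::string const& name)
{
    try {
        return std::locale(name.c_str());
    } catch (std::runtime_error const&) {
        return std::locale::classic();
    }
}

// Illustration only: an unrecognized name yields the "C" locale.
// The name below is an assumption, chosen to be invalid on any platform.
inline void locale_or_classic_demo()
{
    assert(locale_or_classic("no-such-locale-xyz").name() == "C");
}
```

A caller could then write upper(input, locale_or_classic("en-us")) and still get a usable result on systems where that name is not installed.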

Anyway, using a locale as the parameter lets us move the use_facet call inside the function body and use auto for it instead of writing out its full type:

auto const& f = std::use_facet<std::ctype<typename String::value_type>>(locale);


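Not a huge change, but somewhat simpler nonetheless. Given that you're depending on String being contiguous and supporting random access, it might be worth considering making use of that a little more explicitly, by replacing &str[0]+str.size() with &str[str.size()]. Note that indexing at size() is only guaranteed for std::basic_string (since C++11, where it refers to the terminating null character); for other contiguous containers such as std::vector it is undefined behavior.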
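To make the two spellings of the end pointer concrete, here is a small sketch for std::string specifically (same_end_pointer is a hypothetical name used only for illustration). It assumes C++11 or later, where the string's storage is contiguous and s[s.size()] is a valid expression:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: checks that both spellings name the same
// one-past-the-last-character address for a std::string.
inline bool same_end_pointer(std::string& s)
{
    char* by_arithmetic = &s[0] + s.size();  // pointer arithmetic from the first element
    char* by_index      = &s[s.size()];      // address of the terminating null (C++11 and later)
    return by_arithmetic == by_index;
}

// Illustration only: holds for non-empty and empty strings alike.
inline void same_end_pointer_demo()
{
    std::string text = "abc";
    std::string empty;
    assert(same_end_pointer(text));
    assert(same_end_pointer(empty));
}
```

For a generic String that might be a std::vector, the pointer-arithmetic form is the safer choice, since indexing at size() is not defined there.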

Although I'm somewhat on the fence about it, I'm also less than excited about the idea of using a trailing return type when it's not actually necessary. Maybe it's just a sign of my age, but I still tend to prefer the return type in its traditional location when possible.

Putting all those together, we'd end up with something on this order:

template<typename String>
String upper(String str, std::locale const &locale = std::locale())
{
    auto const& f = std::use_facet<std::ctype<typename String::value_type>>(locale);
    f.toupper(&str[0], &str[str.size()]);
    return str;
}


I'm not sure anybody could call that a huge improvement, but I do think it's at least a minor one, especially in convenience to the user.
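For reference, a self-contained sketch of the locale-taking version with its includes and a couple of call sites might look like the following. The upper_demo function and the empty-string guard are additions for illustration, and the sketch uses the pointer-arithmetic end pointer so that it also stays valid for contiguous containers other than std::basic_string:

```cpp
#include <cassert>
#include <locale>
#include <string>

// Sketch of the locale-taking version discussed above.
// Assumes String is contiguous and provides value_type, empty(),
// size(), and operator[] (true for std::basic_string).
template <typename String>
String upper(String str, std::locale const& locale = std::locale())
{
    using Char = typename String::value_type;
    auto const& f = std::use_facet<std::ctype<Char>>(locale);
    // Guard against indexing an empty container; this is not needed for
    // std::string, but it is for containers such as std::vector.
    if (!str.empty())
        f.toupper(&str[0], &str[0] + str.size());
    return str;
}

// Hypothetical demo, using the classic "C" locale so that the result
// does not depend on which locales are installed.
inline void upper_demo()
{
    assert(upper(std::string("hello, world"), std::locale::classic()) == "HELLO, WORLD");
    assert(upper(std::wstring(L"hello"), std::locale::classic()) == L"HELLO");
}
```

Because std::ctype converts one code unit at a time, this approach handles single-byte and wide characters but does not case-map multi-byte UTF-8 sequences.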

Code Snippets

std::string input{"Now is the time for every good man to come to the aid of his country."};

std::string result = upper(input, std::locale("en-us"));

std::string result = upper(input, std::locale(""));

auto const& f = std::use_facet<std::ctype<typename String::value_type>>(locale);

template<typename String>
String upper(String str, std::locale const &locale = std::locale())
{
    auto const& f = std::use_facet<std::ctype<typename String::value_type>>(locale);
    f.toupper(&str[0], &str[str.size()]);
    return str;
}

Context

StackExchange Code Review Q#38272, answer score: 3
