
VS 2015 std::char_traits<char16_t> operations


Problem

At my workplace, we changed our string type (which holds internationalized characters) from std::wstring to std::u16string after upgrading to the VS 2015 (Update 3) compiler.

As a result, we are seeing many performance regressions.

Profiler analysis shows that std::u16string's std::char_traits<char16_t> operations such as copy, compare, find and assign are hit the most and take longer than their std::char_traits<wchar_t> counterparts used by std::wstring.

The std::char_traits<wchar_t> operations are written in terms of std::wmem*, whereas the std::char_traits<char16_t> operations are written in terms of for loops.

If we change these traits operations for char16_t type (or std::u16string) to use our own customized traits, we are seeing performance improvements with performance comparable to std::wstring.

We are planning to write our own custom traits (until MS fixes this in the next version of VS), as follows:

struct string_custom_traits : public std::char_traits<char16_t>
{
    static const char16_t * copy(char16_t* dest, const char16_t* src, size_t count)
    {
        return (count == 0 ? src : (char16_t*)std::memcpy(dest, src, count * sizeof(char16_t)));
    }
};


Would that be OK? Are there any problems with this approach?

Solution

Nothing in std::char_traits::copy says to return src when count is zero. Further, we must return a pointer to modifiable char_type, which also precludes returning src.

So I think we should write:

#include <cstring>
#include <string>

struct string_custom_traits : public std::char_traits<char16_t>
{
    static char_type * copy(char_type* dest, const char_type* src, size_t count)
    {
        return (char_type*)std::memcpy(dest, src, count * sizeof *src);
    }

    static char_type *move(char_type* dest, const char_type* src, size_t count)
    {
        return (char_type*)std::memmove(dest, src, count * sizeof *src);
    }
};


I've used the template-provided alias char_type for consistency and in case you want to turn this traits class into a template itself.
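
As a rough sketch of that template idea, the traits class might be generalized as below and plugged into std::basic_string. The names fast_char_traits and fast_u16string are hypothetical and not part of the original answer. Inside a template the inherited char_type alias is no longer visible by unqualified lookup, so this sketch declares it again.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical template version of string_custom_traits (name is an assumption).
template <typename CharT>
struct fast_char_traits : public std::char_traits<CharT>
{
    // The base class is dependent here, so re-declare the alias explicitly.
    using char_type = CharT;

    // Non-overlapping copy; returns dest, as char_traits::copy is specified to.
    static char_type* copy(char_type* dest, const char_type* src, std::size_t count)
    {
        return static_cast<char_type*>(std::memcpy(dest, src, count * sizeof *src));
    }

    // Copy that tolerates overlapping ranges; also returns dest.
    static char_type* move(char_type* dest, const char_type* src, std::size_t count)
    {
        return static_cast<char_type*>(std::memmove(dest, src, count * sizeof *src));
    }
};

// Hypothetical alias: a char16_t string that uses the custom traits.
using fast_u16string = std::basic_string<char16_t, fast_char_traits<char16_t>>;
```

Note that a std::basic_string with a different traits class is a distinct type from std::u16string, so the two cannot be passed to each other's interfaces without an explicit conversion.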

Make sure you understand and accept the differences between UCS-2 and UTF-16 if you handle characters outside the BMP.
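
To illustrate the difference, here is a small sketch showing that a character outside the BMP occupies two char16_t code units in a std::u16string, so size() and indexing count code units rather than characters. The helper names non_bmp_sample and is_surrogate are hypothetical, chosen only for this example.

```cpp
#include <string>

// U+1F600 lies outside the BMP, so UTF-16 encodes it as a surrogate pair.
inline std::u16string non_bmp_sample()
{
    return u"\U0001F600";
}

// True if the code unit is one half of a surrogate pair (0xD800..0xDFFF).
inline bool is_surrogate(char16_t unit)
{
    return unit >= 0xD800 && unit <= 0xDFFF;
}
```

Here non_bmp_sample().size() is 2, not 1, and both elements satisfy is_surrogate. Code that treats each char16_t as a whole character (the UCS-2 view) can split such a pair when truncating or searching.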

Code Snippets

#include <cstring>
#include <string>

struct string_custom_traits : public std::char_traits<char16_t>
{
    static char_type * copy(char_type* dest, const char_type* src, size_t count)
    {
        return (char_type*)std::memcpy(dest, src, count * sizeof *src);
    }

    static char_type *move(char_type* dest, const char_type* src, size_t count)
    {
        return (char_type*)std::memmove(dest, src, count * sizeof *src);
    }
};

Context

StackExchange Code Review Q#160901, answer score: 3
