patterncppModerate
C++ string_cast<> template function
string_castfunctiontemplate
Problem
In C++, to simplify string conversion between std::string and std::wstring, I created the following utility template functions:
```
#pragma once

#include <windows.h>
#include <cassert>
#include <cstring>
#include <cwchar>
#include <string>
#include <vector>

template <typename Td>
Td string_cast( const char* pSource, unsigned int codePage = CP_ACP );
template <typename Td>
Td string_cast( const wchar_t* pSource, unsigned int codePage = 1200 );
template <typename Td>
Td string_cast( const std::string& source, unsigned int codePage = CP_ACP );
template <typename Td>
Td string_cast( const std::wstring& source, unsigned int codePage = 1200 );

template<>
std::string string_cast<std::string>( const char* pSource, unsigned int codePage )
{
    assert( pSource != 0 );
    return std::string( pSource );
}

template<>
std::wstring string_cast<std::wstring>( const char* pSource, unsigned int codePage )
{
    assert( pSource != 0 );
    std::size_t sourceLength = std::strlen( pSource );
    if( sourceLength == 0 )
    {
        return std::wstring();
    }
    // First call asks for the required buffer size, second call converts.
    int length = ::MultiByteToWideChar( codePage, 0, pSource, sourceLength, NULL, 0 );
    if( length == 0 )
    {
        return std::wstring();
    }
    std::vector<wchar_t> buffer( length );
    ::MultiByteToWideChar( codePage, 0, pSource, sourceLength, &buffer[ 0 ], length );
    return std::wstring( buffer.begin(), buffer.end() );
}

template<>
std::string string_cast<std::string>( const wchar_t* pSource, unsigned int codePage )
{
    assert( pSource != 0 );
    size_t sourceLength = std::wcslen( pSource );
    if( sourceLength == 0 )
    {
        return std::string();
    }
    // First call asks for the required buffer size, second call converts.
    int length = ::WideCharToMultiByte( codePage, 0, pSource, sourceLength, NULL, 0, NULL, NULL );
    if( length == 0 )
    {
        return std::string();
    }
    std::vector<char> buffer( length );
    ::WideCharToMultiByte( codePage, 0, pSource, sourceLength, &buffer[ 0 ], length, NULL, NULL );
    return std::string( buffer.begin(), buffer.end() );
}

template<>
std::wstring string_cast<std::wstring>( const wchar_t* pSource, unsigned int codePage )
{
    assert( pSource != 0 );
    return std::wstring( pSo
```
Solution
While your approach is simple and straightforward to implement there are some important drawbacks to realize.
First, it doesn't take advantage of the power templates offer. Since you're specializing for every possible usage of string_cast, there's no opportunity for the compiler to generate code for you. A consequence of this is that you have a lot of 'clipboard inheritance': copying and pasting the same function and changing parts of it to do what you want. This is a form of code duplication.
The approach does not lend itself to extensibility. What happens if you want to add support for another string type later on? The number of functions you have to write would explode!
So there are clearly opportunities for major improvement. Let's see how we can refactor this so that it better adheres to DRY. If you take a step back and think about how string_cast is used, you'll find there are really just 3 scenarios it has to support:
- cast to the same string type.
- cast to a different string type.
- cast from a raw pointer representation into a string type.
Each of these cases can be handled by writing a template for it. Starting with the string_cast function that acts as an interface:
```
template <typename Td, typename Ts>
Td string_cast(const Ts &source)
{
    return string_cast_imp<Td, Ts>::cast(source);
}
```
string_cast now takes 2 template parameters. Keeping your naming convention, I use Ts to indicate the source type. (TO and FROM are probably better names.) We use type deduction to identify what we're casting from. Once we know what types Td and Ts are, we delegate to string_cast_imp and the appropriate template will be instantiated.
Let's handle the easy case first:
```
template <typename Td>
struct string_cast_imp<Td, Td>
{
    static const Td& cast(const Td &source)
    {
        return source;
    }
};
```
For casting to the same string type, we don't need to do anything. Just return what was given. Since this is nothing more than a pass-through, returning by reference is OK. string_cast will make a copy before going out of scope since it returns by value.
Now for the important case, the reason for writing string_cast in the first place! The basic process is the same, only certain aspects are different:
- conversion function used, e.g. WideCharToMultiByte vs MultiByteToWideChar
- buffer type used, e.g. vector<char> for string vs vector<wchar_t> for wstring
- string type returned. That's captured by our template parameter Td so we don't have to worry about this as much.
You can extract those differences into a trait-like policy class.
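To make the "trait-like policy class" idea concrete before applying it to string_cast, here is a minimal, portable sketch. The names `demo_traits` and `convert_with_traits` are hypothetical, and the per-character loops are ASCII-only stand-ins for MultiByteToWideChar and WideCharToMultiByte so that it compiles without windows.h. It is only meant to show one generic routine pulling its buffer type and conversion function from a traits class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Primary template is declared but never defined, so an unsupported
// string type produces a compile error instead of silently compiling.
template <typename S>
struct demo_traits;

// Policy for narrow strings: owns char, converts char -> wchar_t.
template <>
struct demo_traits<std::string>
{
    typedef char char_type;
    // Assumption: ASCII-only stand-in for ::MultiByteToWideChar.
    static void convert(const char* src, std::size_t n, wchar_t* dst)
    {
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = static_cast<wchar_t>(static_cast<unsigned char>(src[i]));
    }
};

// Policy for wide strings: owns wchar_t, converts wchar_t -> char.
template <>
struct demo_traits<std::wstring>
{
    typedef wchar_t char_type;
    // Assumption: ASCII-only stand-in for ::WideCharToMultiByte;
    // anything outside ASCII becomes '?'.
    static void convert(const wchar_t* src, std::size_t n, char* dst)
    {
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = static_cast<unsigned long>(src[i]) < 128u
                         ? static_cast<char>(src[i])
                         : '?';
    }
};

// One generic routine: the buffer element type comes from the
// destination's traits, the conversion function from the source's traits.
template <typename Td, typename Ts>
Td convert_with_traits(const Ts& source)
{
    if (source.empty())
        return Td();
    std::vector<typename demo_traits<Td>::char_type> buffer(source.size());
    demo_traits<Ts>::convert(source.data(), source.size(), &buffer[0]);
    return Td(buffer.begin(), buffer.end());
}
```

With these definitions, `convert_with_traits<std::wstring>(std::string("abc"))` and `convert_with_traits<std::string>(std::wstring(L"abc"))` both go through the same function body; only the traits differ.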
```
template <typename Td, typename Ts>
struct string_cast_imp
{
    static Td cast(const Ts &source)
    {
        int length = string_traits<Ts>::byte_convert( CP_ACP, source.data(), source.length(),
                                                      NULL, 0 );
        if( length == 0 )
        {
            return Td();
        }
        vector< typename string_traits<Td>::char_trait > buffer( length );
        string_traits<Ts>::byte_convert( CP_ACP, source.data(), source.length(),
                                         &buffer[ 0 ] , length );
        return Td( buffer.begin(), buffer.end() );
    }
};
```
Here I removed the string.empty() check since it's not really needed. If the string is empty, length will be 0 anyway, so this is properly handled.
For the buffer, we use our policy class to tell us the proper character type. If Td = string then string_traits<Td>::char_trait will be a char. If it's a wstring then string_traits<Td>::char_trait will evaluate to a wchar_t.
Similarly, byte_convert acts as a wrapper for the correct conversion function to call. This attribute is captured by our policy class as well.
We define our string_traits policies like this:
```
template <typename T>
struct string_traits;
```
Declare the general base form but don't define it. This way, if code tries to cast from an unsupported string type, it will give a compile error.
```
template <>
struct string_traits<string>
{
    typedef char char_trait;
    static int byte_convert(const int codepage, LPCSTR data  , int data_length,
                                                LPWSTR buffer, int buffer_size)
    {
        return ::MultiByteToWideChar( codepage, 0, data, data_length, buffer, buffer_size );
    }
};
```
You might want to play around with the parameters it accepts, but this should give you the general idea.
And now for the last case. For raw pointer types we can just wrap the pointer in an appropriate string type and call one of our string functions above. We have to overload string_cast here because our base form accepts a reference type. Since references to arrays do not decay into a pointer type, this second template form will specifically handle that case for us.
```
template <typename Td, typename Ts>
Td string_cast(Ts *source)
{
    return string_cast_imp<Td, typename string_type_of<const Ts *>::wrap >::cast(source);
}
```
Notice I'm using const Ts * as the template parameter for string_type_of. Regardless of whether Ts is const or not we always use template form `` to get th
Code Snippets
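As a rough end-to-end illustration of the dispatch described above (same type, different type, raw pointer), here is a portable sketch. Everything lives in a hypothetical `sketch` namespace. The shape of `string_type_of` (two specializations with a `wrap` typedef) is my assumption, since its definition is not shown above. The converting case uses a plain iterator-range construction as an ASCII-only placeholder for the WinAPI calls, so it is not a real code-page conversion.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Assumed shape: maps a raw pointer type to the string class wrapping it.
// Undefined primary, so other pointer types fail to compile.
template <typename T>
struct string_type_of;

template <>
struct string_type_of<const char*>
{
    typedef std::string wrap;
};

template <>
struct string_type_of<const wchar_t*>
{
    typedef std::wstring wrap;
};

// Different source and destination types.
// Placeholder conversion: copies character by character (ASCII only).
template <typename Td, typename Ts>
struct string_cast_imp
{
    static Td cast(const Ts& source)
    {
        return Td(source.begin(), source.end());
    }
};

// Same source and destination type: plain pass-through.
template <typename Td>
struct string_cast_imp<Td, Td>
{
    static const Td& cast(const Td& source)
    {
        return source;
    }
};

// Interface for string objects; Ts is deduced from the argument.
template <typename Td, typename Ts>
Td string_cast(const Ts& source)
{
    return string_cast_imp<Td, Ts>::cast(source);
}

// Interface for raw pointers: the pointer is wrapped into the matching
// string type first, then handled by one of the two cases above.
template <typename Td, typename Ts>
Td string_cast(Ts* source)
{
    return string_cast_imp<Td, typename string_type_of<const Ts*>::wrap>::cast(source);
}

} // namespace sketch
```

For a pointer argument such as `const char* p`, both overloads are viable and the `Ts*` form is selected because it is the more specialized template. Whether `Ts` is deduced as `char` or `const char`, `const Ts*` names the same type, so a single `string_type_of<const char*>` specialization covers both.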
```
template <typename Td, typename Ts>
Td string_cast(const Ts &source)
{
    return string_cast_imp<Td, Ts>::cast(source);
}
```
```
template <typename Td>
struct string_cast_imp<Td, Td>
{
    static const Td& cast(const Td &source)
    {
        return source;
    }
};
```
```
template <typename Td, typename Ts>
struct string_cast_imp
{
    static Td cast(const Ts &source)
    {
        int length = string_traits<Ts>::byte_convert( CP_ACP, source.data(), source.length(),
                                                      NULL, 0 );
        if( length == 0 )
        {
            return Td();
        }
```
Context
StackExchange Code Review Q#1205, answer score: 17