patterncppMinor
Simple parsing for string formatting
Viewed 0 times
formattingsimpleparsingforstring
Problem
Recently I submitted some string formatting and printing code (found here) that I wrote as an small exercise. It was implemented naively using string replacement. This time around I wanted to actually parse the format string. This is the first time I've ever attempted to write code which does any type of parsing. I'm not 100% sure that this code does not contain any critical flaws. In addition to pointing out any such oversights, I would appreciate advice on how this could be modified to make it either faster or more readable.
The functionality is supposed to be similar to the
```
#include
#include
#include
#include
#include
#include
template
std::string to_string(T&& item)
{
return boost::lexical_cast(std::forward(item));
}
template
std::string format(const std::string& fmt, Args&&... args)
{
std::string output;
output.reserve(fmt.length() * 2);
std::string arg_strings[] = { to_string(std::forward(args))... };
bool placeholder_mode = false;
std::string index_string;
for (auto it = fmt.begin(); it != fmt.end(); ++it) {
//check for end of brace enclosed placeholder
if ((*it == '}') && (placeholder_mode == true)) {
if ((index_string.length() == 0)) {
throw std::invalid_argument("Invalid format string.");
}
//check if index_string contains a leading zero
if ((index_string.length() > 1) && (index_string[0] == '0')) {
throw std::invalid_argument("Invalid format string.");
}
int index = std::stoi(index_string);
if (index > (sizeof...(args) - 1)) {
throw std::out_of_range(
"Index (zero based) must be greater than or equal "
"to zero and less than the size of the argument list."
);
}
output += arg_strings[index];
index_string.clear(
The functionality is supposed to be similar to the
Console.WriteLine or String.Format functions of C#.```
#include
#include
#include
#include
#include
#include
template
std::string to_string(T&& item)
{
return boost::lexical_cast(std::forward(item));
}
template
std::string format(const std::string& fmt, Args&&... args)
{
std::string output;
output.reserve(fmt.length() * 2);
std::string arg_strings[] = { to_string(std::forward(args))... };
bool placeholder_mode = false;
std::string index_string;
for (auto it = fmt.begin(); it != fmt.end(); ++it) {
//check for end of brace enclosed placeholder
if ((*it == '}') && (placeholder_mode == true)) {
if ((index_string.length() == 0)) {
throw std::invalid_argument("Invalid format string.");
}
//check if index_string contains a leading zero
if ((index_string.length() > 1) && (index_string[0] == '0')) {
throw std::invalid_argument("Invalid format string.");
}
int index = std::stoi(index_string);
if (index > (sizeof...(args) - 1)) {
throw std::out_of_range(
"Index (zero based) must be greater than or equal "
"to zero and less than the size of the argument list."
);
}
output += arg_strings[index];
index_string.clear(
Solution
There are a few points about this that seem open to improvement (at least to me).
-
Don't compare
becomes:
likewise:
becomes:
-
You have basically everything in one giant function. This makes the code more difficult to understand, analyze, etc.
-
Some of the checks don't seem to accomplish much (if anything)--preventing leading zeros, for one obvious example.
-
Like most parsers, this is basically a state machine. As in many cases, it would probably be somewhat clearer if the state machine were a little more explicit.
I'd probably start with something like this:
...and so on. This structure can help clarify what happens in what states, and the conditions under which you progress from one state to another.
-
Don't compare
bools to true or false, just use their value directly, so placeholder_mode == falsebecomes:
!placeholder_modelikewise:
if(placeholder_mode == true)becomes:
if (placeholder_mode)-
You have basically everything in one giant function. This makes the code more difficult to understand, analyze, etc.
-
Some of the checks don't seem to accomplish much (if anything)--preventing leading zeros, for one obvious example.
-
Like most parsers, this is basically a state machine. As in many cases, it would probably be somewhat clearer if the state machine were a little more explicit.
I'd probably start with something like this:
enum states { COPYING, START_PLACEHOLDER, READ_PLACEHOLDER, END_PLACEHOLDER, ERROR };
states state;
while (it != fmt.end()) {
switch (state) {
case COPYING:
if (*it == '{') state = START_PLACEHOLDER;
else output.push_back(*it);
break;
case START_PLACEHOLDER:
if (*it == '0') state = ERROR;
else state = READ_PLACEHOLDER;
// intentional fall-through
case READ_PLACEHOLDER:
num = std::stoi(it);
check_range(num, 0, arg_count);
output.push_back(args[num]);...and so on. This structure can help clarify what happens in what states, and the conditions under which you progress from one state to another.
Code Snippets
placeholder_mode == false!placeholder_modeif(placeholder_mode == true)if (placeholder_mode)enum states { COPYING, START_PLACEHOLDER, READ_PLACEHOLDER, END_PLACEHOLDER, ERROR };
states state;
while (it != fmt.end()) {
switch (state) {
case COPYING:
if (*it == '{') state = START_PLACEHOLDER;
else output.push_back(*it);
break;
case START_PLACEHOLDER:
if (*it == '0') state = ERROR;
else state = READ_PLACEHOLDER;
// intentional fall-through
case READ_PLACEHOLDER:
num = std::stoi(it);
check_range(num, 0, arg_count);
output.push_back(args[num]);Context
StackExchange Code Review Q#56900, answer score: 3
Revisions (0)
No revisions yet.