HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Review of a C++ missing characters snippet

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
characterssnippetreviewmissing

Problem

I recently had to write a function to return characters missing from an input string. Here is the snippet

#include 
#include 

template
std::string getMissingLetters(Iterator begin, Iterator end) {

    std::bitset letters;

    for (auto it = begin; it != end; ++it) {
        size_t c = tolower(*it) - char('a'); // normalized to start at 'a'
        if (c > 26) continue;
        letters.set(c);
        if (letters.to_ulong() == ((1 << 26) - 1)) break;  // all 26 bits set
    }

    std::string missing;
    for (int i = 0; i < letters.size(); ++i) {
        if (!letters[i]) missing.push_back(char('a' + i));
    }
    return missing;
}


I thought of couple things to improve, for example avoiding branch by clamping characters, eg.

letters.set(std::min(c, 26));


and perhaps blocking the loop s.t. the line

if (letters.to_ulong() == ((1 << 26) - 1)) break;


is executed every N iterations rather than each iteration.

What do you think one could have done better?

Solution

Instead of starting from a zero-ed bitset and then set them to true one by one, you could start with all true, set to false one by one and then use std::bitset::none() to break the loop.

But, apart from this small thing, std::string will probably preallocate memory (on the heap or on the stack; it's implementation dependent), whether or not you add a letter. My advice would be to use an std::array with a fixed size of 26 (or your own stack implementation of a string) to directly store the letters, and remove the bitset and std::string.

That should simplified the code. As the 26 chars will be stored in 7 int, copying or overwriting that size from the stack should be blazing fast (the cache lines of a modern processor is 32 bytes) compared to multiple math operations and a potential heap allocation.

Context

StackExchange Code Review Q#30851, answer score: 2

Revisions (0)

No revisions yet.