HiveBrain v1.2.0
Get Started
← Back to all entries
snippetcppModerate

Implement an algorithm to determine if a string has all unique characters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
uniqueimplementallalgorithmhasdeterminecharactersstring

Problem

I'm working through Cracking the Coding Interview. The first challenge is to implement an algorithm to determine if a string has all unique characters.

I've gone through several iterations, but here's my "final" solution. I've made several assumptions:

  • We are considering case-insensitive strings.



  • We are only considering strings that contain alphabetic characters.



  • We are only considering ASCII strings.



Here's my algorithm and a quick driver. The logic is simple, but I wanted to explain what's going on for sake of the reader.

/*
Implement an algorithm to determine if a string has all unique characters.
What if you can not use additional data structures?
*/

#include 
#include 

bool is_unique(const std::string& s) {
    // seen is a 32-bit bitset. We have seen no letters yet.
    int seen = 0;
    // for every character in the string
    for(auto c: s) {
        // convert the string to lowercase (assumption 1)
        char lower_c = std::tolower(c);
        // normalize the character (assumptions 2 and 3)
        int offset = lower_c - 'a';
        // shift 1 at most 26 to the left. Fits within 32-bit bitset.
        int shifted = 1 << offset;

        if(!(seen & shifted)) {
            // if we haven't seen it, add it to seen
            seen |= shifted;
        } else {
            // otherwise return false -- we have seen it
            return false;
        }
    }
    // if we haven't returned false after the loop, it must be unique
    return true;
}

int main(int argc, char** argv) {
    std::string s{argv[1]};
    std::cout << "Is " << s << " unique? " << is_unique(s) << '\n';
}


Does anyone have recommendations?

Solution

If I'd asked an interview question, and got this as an answer (at least as the first answer) I think my immediate reaction would be to consider it excessively clever.

To make much sense, along with the conditions you gave, I'd have had to specify that minimizing memory usage was absolutely crucial.

Under ordinary circumstances, I'd expect an answer something like this:

bool all_unique(std::string const &input) { 
    std::map temp(input.begin(), input.end());
    return temp.size == input.size();
}


...or something like this:

bool all_unique(std::string input) { 
    std::sort(input.begin(), input.end());
    auto p = std::unique(input.begin(), input.end());
    return p == input.end();
}


There are, of course, variations on these themes (e.g., perfectly reasonable and acceptable to use unordered_set instead of set).

Either of these will work with a much wider range of inputs than you've specified. They will use more storage, but the amount they use is still too small to matter under any normal circumstances.

If somebody were going to craft a special-purpose set of their own, I'd normally expect them to start from std::bitset or std::vector. I guess if they insisted on starting with a native type to represent their bits, I wouldn't object too terribly much, but I'd be a little disappointed if they chose int for that type.

  • If you want a bag of bits, you should usually use an unsigned type.



  • If you want at least 32 bits, you should use a type that guarantees at least 32 bits (int only guarantees at least 16 bits).



As such, obvious choices would be unsigned long and uint32_t.

Code Snippets

bool all_unique(std::string const &input) { 
    std::map<char> temp(input.begin(), input.end());
    return temp.size == input.size();
}
bool all_unique(std::string input) { 
    std::sort(input.begin(), input.end());
    auto p = std::unique(input.begin(), input.end());
    return p == input.end();
}

Context

StackExchange Code Review Q#134450, answer score: 14

Revisions (0)

No revisions yet.