HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Removing duplicate characters from a string

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
removingduplicatecharactersfromstring

Problem

I implemented a function removeDuplicates which takes 1 parameter, a std::string, and returns a std::string without the duplicate characters.

Example I/O might look like this:

removeDuplicates("hello"); //--> "helo"
removeDuplicates("this is easy"); //--> "this eay"


Here is the function:

std::string removeDuplicates(const std::string& str)
{
    //Map for storing character count
    std::map ctable{};

    //Count characters
    std::for_each(str.begin(), str.end(), [&ctable](const auto& value) {
        ++ctable[value];
    });

    std::string newstring = "";
    newstring.resize(str.size());

    //Only copy one character of each type to the new string
    std::copy_if(str.begin(), str.end(), newstring.begin(), [&ctable](const auto& value) {
        return ctable.erase(value) > 0;
    });

    //newstring is bigger than necessary if it has duplicates
    newstring.shrink_to_fit();

    return newstring;
}


Is there a more efficient way to do it?

And what can be improved about the actual code of the function (like variable names/comments, ...).

Solution

Design.

Do you actually need to count all the characters?

I would just record that I had already moved them as I moved them from source to destination.

std::set  moved;
std::copy_if(str.begin(), str.end(), newstring.begin(), [&moved](const auto& value) {
    return moved.insert(value).second;
});


Comments

To be honest I don't find your comments objectionable. Maybe slightly too many but not egregiously so.

Remember do not make your comments describe the code. If I can read the code the I will not need the comments. You should use comments to describe WHY and the code shows HOW.

// Loop over the array
 for_each(array.begin(), array.end() ,


That's a bad comment as I can read the code and see that it loops over the array I don't need the comment to tell me that. Bad comments are worse than no comments. So best to write Self Documenting Code (this just means your variables and function names should be descriptive so I can read the code and understand what is happening).

Code Review.

Don't need the {} to default construct an empty table.

std::map ctable{};


Rather than std::for_each I would use the new range based for.

std::for_each(str.begin(), str.end(), [&ctable](const auto& value) {
        ++ctable[value];
    });

    // try
    for(auto value: str) {
        ++ctable[value];
    }


I would avoid the resize() and prefer reserve() in this situation.

std::string newstring = "";
    newstring.resize(str.size());


Then when inserting into newstring I would use a back inserter. The reserve() makes sure that you will never need to reallocate the space but you don't make the string too long. Then use the back inserter to increase the size of the string as needed. At the end you don't need to shrink to fit.

std::copy_if(str.begin(), str.end(), std::back_inserter(newstring), [&ctable](const auto& value) {
        return ctable.erase(value) > 0;
    });


The new version of the standard has added std::begin() and std::end(). Prefer to use these rather than the member functions. This will help when your code is templatized and somebody passes an array rather than a standard container.

std::string removeDuplicates(const std::string& str)
{
    std::set moved;    
    std::string    result;
    result.reserve(str.size());

    //Only copy one character of each type to the new string
    std::copy_if(std::begin(str), std::end(str),
                 std::back_inserter(result),
                 [&moved](const auto& value) {return moved.insert(value).second;}
                );

    return newstring;
}

Code Snippets

std::set<char>  moved;
std::copy_if(str.begin(), str.end(), newstring.begin(), [&moved](const auto& value) {
    return moved.insert(value).second;
});
// Loop over the array
 for_each(array.begin(), array.end() ,
std::map<char, unsigned int> ctable{};
std::for_each(str.begin(), str.end(), [&ctable](const auto& value) {
        ++ctable[value];
    });

    // try
    for(auto value: str) {
        ++ctable[value];
    }
std::string newstring = "";
    newstring.resize(str.size());

Context

StackExchange Code Review Q#126756, answer score: 7

Revisions (0)

No revisions yet.