patterncppMinor
Counting number of unique elements in a sorted container
Viewed 0 times
uniquenumbercountingcontainerelementssorted
Problem
Similar to this question, but in C++. Although my priority is performance (and correctness!) rather than readability.
#include
#include
#include
template
size_t count_unique(const Container& elements)
{
size_t result {};
for (auto it = std::cbegin(elements), last = std::cend(elements); it != last;) {
++result;
it = std::upper_bound(it, last, *it);
}
return result;
}Solution
Worst case is \$O(n \log n)\$
If we pass an array of completely unique elements, we get a pretty bad runtime. Your solution is good if we expect a lot of duplicates, but bad if we don't.
As an optimization for the worst case, we could add some exponential bounding to find the next element. Basically, replace
with:
Of course is now pretty bad for non-random access iterators since we have to keep doing lots of distance and advance related calls, so I would simply default to the trivial comparison algorithm in that case. That is:
Additional Requirements
The binary-search optimization requires that the type the container holds is ordered. You may want to fallback to the simple forward-iterator version if the type is only equality-comparable but not less-than-comparable.
Initialization
I would prefer:
to
No reason to hide the
If we pass an array of completely unique elements, we get a pretty bad runtime. Your solution is good if we expect a lot of duplicates, but bad if we don't.
As an optimization for the worst case, we could add some exponential bounding to find the next element. Basically, replace
it = std::upper_bound(it, last, *it);with:
std::ptrdiff_t bound = 1;
while (std::distance(it, last) < bound && *std::next(it, bound) == *it) {
bound *= 2;
}
it = std::upper_bound(
std::next(it, bound/2),
std::next(it, std::min(bound, std::distance(it, last))),
*it);Of course is now pretty bad for non-random access iterators since we have to keep doing lots of distance and advance related calls, so I would simply default to the trivial comparison algorithm in that case. That is:
template
size_t count_unique(C const& container) {
using It = decltype(std::cbegin(container));
return count_unique(container, std::iterator_traits::iterator_category{});
}
template
size_t count_unique(C const& container, std::random_access_iterator_tag ) {
// what I suggested
}
template
size_t count_unique(C const& container, std::forward_iterator_tag ) {
auto first = std::cbegin(container), last = std::cend(container);
if (first == last) return 0;
size_t count = 1;
for (auto next = std::next(first); next != last; ++first, ++next)
{
count += !(*next == *first);
}
return count;
}Additional Requirements
The binary-search optimization requires that the type the container holds is ordered. You may want to fallback to the simple forward-iterator version if the type is only equality-comparable but not less-than-comparable.
Initialization
I would prefer:
size_t result = 0;to
size_t result {};No reason to hide the
0.Code Snippets
it = std::upper_bound(it, last, *it);std::ptrdiff_t bound = 1;
while (std::distance(it, last) < bound && *std::next(it, bound) == *it) {
bound *= 2;
}
it = std::upper_bound(
std::next(it, bound/2),
std::next(it, std::min(bound, std::distance(it, last))),
*it);template <class C>
size_t count_unique(C const& container) {
using It = decltype(std::cbegin(container));
return count_unique(container, std::iterator_traits<It>::iterator_category{});
}
template <class C>
size_t count_unique(C const& container, std::random_access_iterator_tag ) {
// what I suggested
}
template <class C>
size_t count_unique(C const& container, std::forward_iterator_tag ) {
auto first = std::cbegin(container), last = std::cend(container);
if (first == last) return 0;
size_t count = 1;
for (auto next = std::next(first); next != last; ++first, ++next)
{
count += !(*next == *first);
}
return count;
}size_t result = 0;size_t result {};Context
StackExchange Code Review Q#116229, answer score: 8
Revisions (0)
No revisions yet.