HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Find the most frequent element in a sequence without copying elements

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
withouttheelementssequencecopyingelementfindfrequentmost

Problem

Specification:


Given first, last (which are ForwardIterators, and whose std::iterator_trais::value_type is LessThanComparable), find the most frequent element in the sequence and return pair of iterator to the last occurrence of the element and frequency count. When using overload with comparator (which is Compare), the restriction on value type of iterators is lifted.

Usage guidelines:


Should be used when the elements in the sequence are non copyable or too expensive to copy. Should be avoided when entropy of the sequence is very high, most of the values in the sequence are distinct, size of value type of iterators is within range of a few integers and the sequence is very large.

Code:

#ifndef AREA51_ALGORITHM_HPP
#define AREA51_ALGORITHM_HPP

#include 
#include 
#include 
#include 

template 
std::pair most_frequent(ForwardIterator first,
                                                      ForwardIterator last, Comparator comparator)
{
    auto comp = [&comparator](const auto& lhs, const auto& rhs)
    {
        return comparator(lhs.get(), rhs.get());
    };

    std::map::value_type>,
            std::size_t, decltype(comp)> counts(comp);

    std::size_t frequency = 0;
    auto most_freq = first;
    while (first != last)
    {
        std::size_t current = ++counts[*first];
        if (current > frequency)
        {
            frequency = current;
            most_freq = first;
        }

        ++first;
    }

    return std::make_pair(most_freq, frequency);
}

template 
std::pair most_frequent(ForwardIterator first, ForwardIterator last)
{
    return most_frequent(first, last, std::less<>{});
}

#endif //AREA51_ALGORITHM_HPP


It took roughly 3 milliseconds to find the most frequent integer in sequence of 100'000 integers that varies from 0 to 100 (release build, still faster than human reaction). The benchmark was very simplistic, so in the real world scenario performance can be different. Some further twisting of input (st

Solution

Does not compile unless I add the move operators to integer.

struct integer
{
    int x;
    integer(int y): 
            x(y)
    {}  

    integer(const integer& other) = delete; //non copyable
    integer& operator=(const integer& other) = delete;
    // Added these
    integer(integer&&) = default;
    integer& operator=(integer&&) = default;
};


You don't need that lamda in the your main function. You just did not specify your comparison operator correctly.

return most_frequent(first, last, std::less<>{});

// Should be

return most_frequent(first, last, std::less::value_type>{});


Then you can remove:

auto comp = [&comparator](const auto& lhs, const auto& rhs)
{
    return comparator(lhs.get(), rhs.get());
};


and just use Comparator where you use comp

There is no need to have two versions of the function most_frequent just use default parameter values.

// This is not needed
// You can remove it.
template 
std::pair most_frequent(ForwardIterator first, ForwardIterator last)
{
    return most_frequent(first, last, std::less<>{});
}


Modify the declaration of the main function:

template ::value_type>>
                                                   //   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
std::pair most_frequent(
                                            ForwardIterator first,
                                            ForwardIterator last,
                                            Comparator comparator = Comparator())
                                                  //                ^^^^^^^^^^^^


The code executes in time faster than human reaction, so I think for the first version this should be enough.

I would hope so.

Your tests sets are relatively small. Haw long does it take to scan the whole library of congress would be a better test.
Final Version:

template ::value_type>>
std::pair most_frequent(ForwardIterator first,
                                                      ForwardIterator last, Comparator comparator = Comparator())
{   
    std::map::value_type>,
            std::size_t, Comparator> counts(comparator);
    
    std::size_t frequency = 0;
    auto most_freq = first;
    while (first != last)
    {   
        std::size_t current = ++counts[*first];
        if (current > frequency)
        {   
            frequency = current;
            most_freq = first;
        }
        
        ++first;
    }
    
    return std::make_pair(most_freq, frequency);
}

Code Snippets

struct integer
{
    int x;
    integer(int y): 
            x(y)
    {}  

    integer(const integer& other) = delete; //non copyable
    integer& operator=(const integer& other) = delete;
    // Added these
    integer(integer&&) = default;
    integer& operator=(integer&&) = default;
};
return most_frequent(first, last, std::less<>{});

// Should be

return most_frequent(first, last, std::less<typename std::iterator_traits<ForwardIterator>::value_type>{});
auto comp = [&comparator](const auto& lhs, const auto& rhs)
{
    return comparator(lhs.get(), rhs.get());
};
// This is not needed
// You can remove it.
template <typename ForwardIterator>
std::pair<ForwardIterator, std::size_t> most_frequent(ForwardIterator first, ForwardIterator last)
{
    return most_frequent(first, last, std::less<>{});
}
template <typename ForwardIterator, typename Comparator = std::less<typename std::iterator_traits<ForwardIterator>::value_type>>
                                                   //   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
std::pair<ForwardIterator, std::size_t> most_frequent(
                                            ForwardIterator first,
                                            ForwardIterator last,
                                            Comparator comparator = Comparator())
                                                  //                ^^^^^^^^^^^^

Context

StackExchange Code Review Q#157456, answer score: 4

Revisions (0)

No revisions yet.