patterncppModerate
Dividing a range into N sub-ranges
Viewed 0 times
rangeintosubrangesdividing
Problem
Introduction
The function divides a range given by a
Example
Consider a set of size 10:
\$\{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \}\$
The output would be is a set of
\$\{ \{ 0, 3 \}, \{ 4, 6 \}, \{ 7, 9 \} \}\$
Where each sub-set is as even as possible, in the previous example the sets have sizes 4, 3, and 3, respectively; that is the most even way to distribute 10 elements into 3 sub-sets.
Review goals
Implementation
The oringinal code contained an issue which I missed by pure luck. Here is the fixed working code; the changes do not invalidate the present answer.
```
#include // std::vector
#include // std::distance(), std::next()
template
decltype( auto ) divide_work( Iterator begin, Iterator end, std::size_t n )
{
using size_type = std::size_t;
auto sz = std::distance( begin, end );
auto element_per_group_count = sz / n;
auto remaining_element_count = sz % n;
std::vector> ranges;
ranges.reserve( n );
if ( remaining_element_count == 0 )
{
--element_per_group_count;
ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );
for ( auto i = 1U; i < n; ++i )
{
auto first = std::next( ranges[ i - 1 ].second, 1 );
ranges.emplace_back( std::make_pair( first,
std::next( first, element_per_group_count ) ) );
}
}
else
{
ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );
for ( auto i = 1U; i < remaining_element_count; ++i )
{
auto first = std::next( ranges[ i - 1 ].second, 1 );
ranges.emplace_back( std::make_pair( first,
std::next( first, element_per_group
The function divides a range given by a
begin_iterator and an end_iterator into n sub-ranges of the same format, where the sub-ranges must be as equally sized as possible.Example
Consider a set of size 10:
\$\{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \}\$
The output would be is a set of
n = 3 sub-ranges:\$\{ \{ 0, 3 \}, \{ 4, 6 \}, \{ 7, 9 \} \}\$
Where each sub-set is as even as possible, in the previous example the sets have sizes 4, 3, and 3, respectively; that is the most even way to distribute 10 elements into 3 sub-sets.
Review goals
- Efficiency
- Correctness
Implementation
The oringinal code contained an issue which I missed by pure luck. Here is the fixed working code; the changes do not invalidate the present answer.
```
#include // std::vector
#include // std::distance(), std::next()
template
decltype( auto ) divide_work( Iterator begin, Iterator end, std::size_t n )
{
using size_type = std::size_t;
auto sz = std::distance( begin, end );
auto element_per_group_count = sz / n;
auto remaining_element_count = sz % n;
std::vector> ranges;
ranges.reserve( n );
if ( remaining_element_count == 0 )
{
--element_per_group_count;
ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );
for ( auto i = 1U; i < n; ++i )
{
auto first = std::next( ranges[ i - 1 ].second, 1 );
ranges.emplace_back( std::make_pair( first,
std::next( first, element_per_group_count ) ) );
}
}
else
{
ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );
for ( auto i = 1U; i < remaining_element_count; ++i )
{
auto first = std::next( ranges[ i - 1 ].second, 1 );
ranges.emplace_back( std::make_pair( first,
std::next( first, element_per_group
Solution
Misleading Return Type
You have:
The difference between
Even better, the return type for this function is not a complicated expression that's difficult to spell. It's simply
emplace_back
The advantage of
That's inefficient (you're copy constructing a pair instead of in-place constructing it) and just a bunch of extra typing. Also
Overly Complicated Algorithm!
I'm not sure why you have three for loops or why even division is a special case of the algorithm... Also I'm pretty sure that special case is wrong. Consider a range of size 15 dividing into 3 chunks, so it divides evenly. We'd start with
Then the next two ranges will be:
We successfully divided into three chunks, but we're actually missing elements! 4, 9, and 14 will not appear in any of the three chunks. The correct division would be
We're just chunking the range
This is all you need:
Dividing up the remainder one at a time means that if we have 15 elements into 6 chunks, we end up with sizes of [3, 3, 3, 2, 2, 2] instead of any other weird alternative like the naive [2, 2, 2, 2, 2, 5].
Too many chunks? None?
What do you do in the case where
You have:
template
decltype( auto ) divide_work( Iterator begin, Iterator end, std::size_t n )The difference between
auto and decltype(auto) is that the latter can deduce reference types as well. But you're returning a value, always, regardless of Iterator. This choice makes it seem as if you could be returning a reference. If you are going to use a placeholder return type here, use simply auto.Even better, the return type for this function is not a complicated expression that's difficult to spell. It's simply
std::vector>, so let's just use that:template
std::vector>
divide_work( Iterator begin, Iterator end, std::size_t n )emplace_back
The advantage of
emplace_back is not having to actually construct the object to pass in. So when you write:ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );That's inefficient (you're copy constructing a pair instead of in-place constructing it) and just a bunch of extra typing. Also
next(iter, 0) is iter, so the above line is logically exactly equivalent to:ranges.emplace_back(begin, std::next(begin, element_per_group_count));Overly Complicated Algorithm!
I'm not sure why you have three for loops or why even division is a special case of the algorithm... Also I'm pretty sure that special case is wrong. Consider a range of size 15 dividing into 3 chunks, so it divides evenly. We'd start with
element_per_group_count == 5, but then we decrement it to 4, and add the first range of:[0, 4)Then the next two ranges will be:
[5, 9)
[10, 14)We successfully divided into three chunks, but we're actually missing elements! 4, 9, and 14 will not appear in any of the three chunks. The correct division would be
[[0, 5), [5, 10), [10, 15)]. We're just chunking the range
(begin, end) into n pieces. That implies a simple for-loop. We need n things. Each one is approximately of size distance(begin, end)/n. There will be distance(begin, end)%n remainder, which we will allocate one at a time as we go until we run out of remainder. When we already added n-1 things, we just can add the rest as the last chunk (since there's nothing else to do at that point). This is all you need:
template
std::vector>
divide_work( Iterator begin, Iterator end, std::size_t n )
{
std::vector> ranges;
ranges.reserve(n);
auto dist = std::distance(begin, end);
auto chunk = dist / n;
auto remainder = dist % n;
for (size_t i = 0; i < n-1; ++i) {
auto next_end = std::next(begin, chunk + (remainder ? 1 : 0));
ranges.emplace_back(begin, next_end);
begin = next_end;
if (remainder) remainder -= 1;
}
// last chunk
ranges.emplace_back(begin, end);
return ranges;
}Dividing up the remainder one at a time means that if we have 15 elements into 6 chunks, we end up with sizes of [3, 3, 3, 2, 2, 2] instead of any other weird alternative like the naive [2, 2, 2, 2, 2, 5].
Too many chunks? None?
What do you do in the case where
n > distance(begin, end) or n == 0? I guess we should probably add some error checks:template
std::vector>
divide_work( Iterator begin, Iterator end, std::size_t n )
{
std::vector> ranges;
if (n == 0) return ranges;
ranges.reserve(n);
auto dist = std::distance(begin, end);
n = std::min(n, dist);
auto chunk = dist / n;
auto remainder = dist % n;
for (size_t i = 0; i < n-1; ++i) {
auto next_end = std::next(begin, chunk + (remainder ? 1 : 0));
ranges.emplace_back(begin, next_end);
begin = next_end;
if (remainder) remainder -= 1;
}
// last chunk
ranges.emplace_back(begin, end);
return ranges;
}Code Snippets
template <typename Iterator>
decltype( auto ) divide_work( Iterator begin, Iterator end, std::size_t n )template <typename Iterator>
std::vector<std::pair<Iterator, Iterator>>
divide_work( Iterator begin, Iterator end, std::size_t n )ranges.emplace_back( std::make_pair( std::next( begin, 0 ),
std::next( begin, element_per_group_count ) ) );ranges.emplace_back(begin, std::next(begin, element_per_group_count));[5, 9)
[10, 14)Context
StackExchange Code Review Q#106773, answer score: 10
Revisions (0)
No revisions yet.