debugcppMinor
Benchmark a sorting method for fixed-size arrays
Viewed 0 times
sortingarraysmethodsizeforbenchmarkfixed
Problem
During the past few days, I have written a small C++14 library which mainly provides the
While I am rather confident that the new sorting method achieves its goal, I still wrote a benchmarking program to compare
The program generates shuffled arrays for every small array size (\$0\$ to \$64\$ in our case) and sorts them one million times per algorithm and returns series of integers to the standard output which correspond to the execution times of the algorithms in milliseconds (one line corresponds to the time of one algorithm, where the \$n\$th integer in the line corresponds to the time it took to sort one million of shuffled arrays of size \$n\$).
```
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
template
>
auto std_sort(RandomAccessIterable& iterable, Compare&& compare={})
-> void
{
std::sort(std::begin(iterable), std::end(iterable), std::forward(compare));
}
template
auto time_compare(SortFunction1 sort1, SortFunction2 sort2, std::size_t times)
-> std::array
{
// Random numbers generator
thread_local std::mt19937_64 engine(std::time(nullptr));
// Generate shuffled array, the same for both algorithms
std::array array;
std::iota(std::begin(array), std::end(array), 0);
std::shuffle(std::begin(array), std::end(array), engine);
// Time first algorithm
auto start = std::chrono::high_resolution_clock::now();
for (std::size_t i = 0 ; i {});
}
auto end = std::chrono::high_resolution_clock::now();
auto duration1 = std::chrono::duration_cast(end - start
cppsort::sort function. It is designed to efficiently sort small arrays when their time is known at compile time, and to fall back to the standard library std::sort otherwise.While I am rather confident that the new sorting method achieves its goal, I still wrote a benchmarking program to compare
cppsort::sort and std::sort. Since I am not used to writing benchmarks, I would like this benchmarking method reviewed in order to know whether there are bias which might affect the results. Any kind of review is welcome :)The program generates shuffled arrays for every small array size (\$0\$ to \$64\$ in our case) and sorts them one million times per algorithm and returns series of integers to the standard output which correspond to the execution times of the algorithms in milliseconds (one line corresponds to the time of one algorithm, where the \$n\$th integer in the line corresponds to the time it took to sort one million of shuffled arrays of size \$n\$).
```
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
template
>
auto std_sort(RandomAccessIterable& iterable, Compare&& compare={})
-> void
{
std::sort(std::begin(iterable), std::end(iterable), std::forward(compare));
}
template
auto time_compare(SortFunction1 sort1, SortFunction2 sort2, std::size_t times)
-> std::array
{
// Random numbers generator
thread_local std::mt19937_64 engine(std::time(nullptr));
// Generate shuffled array, the same for both algorithms
std::array array;
std::iota(std::begin(array), std::end(array), 0);
std::shuffle(std::begin(array), std::end(array), engine);
// Time first algorithm
auto start = std::chrono::high_resolution_clock::now();
for (std::size_t i = 0 ; i {});
}
auto end = std::chrono::high_resolution_clock::now();
auto duration1 = std::chrono::duration_cast(end - start
Solution
-
Your code never even looks at the sorted array. An all to smart compiler could figure out that the whole sorting is useless and optimize it out. You could avoid this, for example, by assigning a randomly picked element of the sorted array to a
-
You are sorting the same permutation over and over again. You'd get more representative results if you'd use different permutations. This is important especially since the run-time of some sorting algorithms depends on the inputs. Again, a very smart compiler might also figure out that sorting the same data again is pointless. Since the expected run-time of sorting such small arrays is tiny, it might seem reasonable to time a larger number of runs to become less dependent on the clock accuracy. You could, however, sort the same array again using a different comparator. For example, you could XOR a random value to the values before comparing them. Pick a different value each time you sort.
-
The second algorithm might have an unfair advantage since the data will already be cached. If you are worried about this, randomize the order in which the algorithms run.
-
It is implementation defined whether
-
Instead of only plotting the average values, compute mean and standard deviation and plot them as data points with error bars. Then overlay a weighted linear regression of α + β n log(n) (or whatever curve you deem appropriate). This will give you the desired “smoother” and more honest curve.
Your code never even looks at the sorted array. An all to smart compiler could figure out that the whole sorting is useless and optimize it out. You could avoid this, for example, by assigning a randomly picked element of the sorted array to a
volatile variable. Don't make the whole array volatile, however, or you'll lose a lot of useful optimizations.-
You are sorting the same permutation over and over again. You'd get more representative results if you'd use different permutations. This is important especially since the run-time of some sorting algorithms depends on the inputs. Again, a very smart compiler might also figure out that sorting the same data again is pointless. Since the expected run-time of sorting such small arrays is tiny, it might seem reasonable to time a larger number of runs to become less dependent on the clock accuracy. You could, however, sort the same array again using a different comparator. For example, you could XOR a random value to the values before comparing them. Pick a different value each time you sort.
-
The second algorithm might have an unfair advantage since the data will already be cached. If you are worried about this, randomize the order in which the algorithms run.
-
It is implementation defined whether
std::chrono::high_resolution_clock is a real-time clock or steady. If it is not steady and you have running an NTP daemon in the background, your timing results will be worthless. You could use std::chrono::steady_clock instead or statically select the best available clock at compile-time. In your case, it might also be worth consideration to use std::clock instead, if you are not interested in system effects.-
Instead of only plotting the average values, compute mean and standard deviation and plot them as data points with error bars. Then overlay a weighted linear regression of α + β n log(n) (or whatever curve you deem appropriate). This will give you the desired “smoother” and more honest curve.
Context
StackExchange Code Review Q#98883, answer score: 4
Revisions (0)
No revisions yet.