patterncppMinor
Function that normalizes a vector of double
Viewed 0 times
normalizesfunctiondoublethatvector
Problem
I have written the following function in order to normalize a container of double, so that it sums up to 1.0:
Is it possible to do better, or having a more precise result?
/**
* @brief This function normalizes a container so that it sums to 1.0.
*
* @param begin The beginning of the range to normalize.
* @param end The end of the range to normalize.
* @param out The beginning of the output range (can be the same as begin).
*/
template
void normalizeProbability(InputIterator begin, InputIterator end, OutputIterator out) {
double norm = static_cast(std::accumulate(begin, end, 0.0));
std::transform(begin, end, out, [norm](decltype(*begin) t){ return t/norm; });
}Is it possible to do better, or having a more precise result?
Solution
It looks pretty good as it is but I can offer the following:
I have removed the cast to double, which isn't necessary. The double return type is deduced by the
The biggest source of error you have is in the summation of
However I have no idea if this level of accuracy is necessary for you, but you stated precision as a question so I went for it. :)
template
void normalizeProbability(InputIterator begin, InputIterator end, OutputIterator out) {
typedef typename std::iterator_traits::reference ref_t;
double norm = std::accumulate(begin, end, 0.0);
std::transform(begin, end, out, [norm](const ref_t t){ return t/norm; });
}I have removed the cast to double, which isn't necessary. The double return type is deduced by the
0.0 argument. I also changed to use std::iterator_traits instead of decltype which I think is a bit cleaner.The biggest source of error you have is in the summation of
norm. Adding a small floating point (FP) number (the dereferenced value) to a large floating point number (the current sum) will cause a truncation error. For example 1.0 + 1E-18 == 1.0 will return true due to the limited precision of FP-numbers. What you want to do is to add the numbers from the smallest to the biggest. But simply sorting isn't good enough, consider 10^20 entries (yeah I know fat chance but it's just for illustration) each with value 1E-18 you expect the norm to be 10^2, but in fact it will be close to 1.0 or less. Because after about 1E18 iterations the running sum will be so big that any remaining elements will truncate. So you need to have a heap, take the two smallest numbers, add them and put the result back into the heap. Then repeat until only one number is left, which will be the sum. This guarantees that you will always minimize the truncation error.However I have no idea if this level of accuracy is necessary for you, but you stated precision as a question so I went for it. :)
Code Snippets
template <typename InputIterator, typename OutputIterator>
void normalizeProbability(InputIterator begin, InputIterator end, OutputIterator out) {
typedef typename std::iterator_traits<InputIterator>::reference ref_t;
double norm = std::accumulate(begin, end, 0.0);
std::transform(begin, end, out, [norm](const ref_t t){ return t/norm; });
}Context
StackExchange Code Review Q#66892, answer score: 6
Revisions (0)
No revisions yet.