HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Why is adding log probabilities faster than multiplying probabilities?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
whylogmultiplyingthanaddingfasterprobabilities

Problem

To frame the question, in computer science often we want to calculate the product of several probabilities:

P(A,B,C) = P(A) * P(B) * P(C)


The simplest approach is simply to multiply these numbers, and that is what I was going to do. However, my boss said it's better to add the log of the probabilities:

log(P(A,B,C)) = log(P(A)) + log(P(B)) + log(P(C))


This gives the log probability, but we can get the probability afterwards if necessary:

P(A,B,C) = e^log(P(A,B,C))


Log addition is considered better for two reasons:

  • It prevents "underflow" whereby the product of probabilities is so small that it gets rounded to zero. This can often be a risk since probabilities are often very small.



  • It is faster because many computer architectures can perform addition more quickly than multiplication.



My question is about the second point. This is how I have seen it described, but it doesn't take into account the added cost of getting the log! We should be comparing "cost of log + cost of addition" to "cost of multiplication". Is it still smaller after taking that into account?

Also, the Wikipedia page (Log probability) is confusing in this respect, stating "The conversion to log form is expensive, but is only incurred once." I don't understand this, because I think you would need to take the log of every term independently before adding. What am I missing?

Finally, the justification that "computers perform addition faster than multiplication" is kind of vague. Is that specific to the x86 instruction set, or is it some more fundamental trait of processor architectures?

Solution

Also, the Wikipedia page (https://en.wikipedia.org/wiki/Log_probability) is confusing in this respect, stating "The conversion to log form is expensive, but is only incurred once." I don't understand this, because I think you would need to take the log of every term independently before adding. What am I missing?

If you just want to compute $P(A_1)\ldots P(A_n)$ once, then you are right. You will have to compute $n$ logarithms and $n-1$ additions, whereas the naive method requires $n-1$ multiplications.

However, it is very common that you want to answer queries of the form:


Compute $\prod_{i \in I} P(A_i)$ for some subset $I$ of $\{1, \ldots n\}$.

In that case, you can preprocess your data to compute all $\log P(A_i)$ only once, and answer each query by doing $|I|$ additions.


Finally, the justification that "computers perform addition faster than multiplication" is kind of vague. Is that specific to the x86 instruction set, or is it some more fundamental trait of processor architectures?

This is a broader question. In general it is (probably?) harder to compute multiplication than addition. Computing $a+b$ is linear in the size of $a$ and $b$ (using the trivial algorithm), whereas we currently don't know how to compute $a\times b$ with the same time complexity (check the best algorithms here).

Of course there is no definitive answer: for instance if you deal with integers only and you multiply by powers of $2$, then you should rather compare shift with add operations.

Nevertheless this is a reasonable statement on all common computer architectures: multiplication on floating-point numbers will be slower than addition.

Context

StackExchange Computer Science Q#77135, answer score: 17

Revisions (0)

No revisions yet.