HiveBrain v1.2.0
Get Started
← Back to all entries
debugMinor

Choosing error rates for probabilistic algorithms

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
errorratesprobabilisticalgorithmsforchoosing

Problem

Probabilistic algorithms often have a parameter that allows one to tune the error rate, typically by running the algorithm repeatedly. This often gives an error rate of something like $2^{-k}$ for $k$ iterations. This is a fine situation to be in, because $2^{-k}$ can be made as close to $1$ as you like without having to make $k$ very big at all. At this point, the theoretician sits back contentedly, his or her job done.

In practical terms, though, are there any guidelines as to which value of $k$ should one choose? Obviously, there's no universal answer, since the answer in any particular situation will be a trade-off depending on the importance of avoiding errors and the cost of doing more iterations.

For example, choosing $k=10$ gives an error rate of about one in a thousand, which seems rather high for most purposes; choosing $k=60$ means the expected number of errors would still be less than one even if you'd run the algorithm once a second since the big bang.

Solution

This is the arithmetic for Juho's answer. (Run it for the length of time it takes to make the algorithm failure probability equal the hardware failure probability).

Suppose it takes time $t$ seconds to perform one computation, and thus time $kt$ to get the algorithm error probability down to $2^{-k}$.

Suppose that the hardware probability of failure per second is $p$. Then the probability of a hardware failure is approximately $ktp$ (see lemma below). So you want the $k$ that is the solution to

$$ \frac{1}{2^k} = ktp. $$

Lemma

Assume a memoryless failure process with a failure rate per time period of $p$. Let $P=1/p$ (makes the arithmetic easier). Then for very small $p$ (large $P$) the probability of success in $k$ time periods is

$$ 1 - (\frac{P-1}{P})^k = \frac{P^k - (P-1)^k}{P^k}.$$

By the binomial theorem this is

$$ \frac{P^k - P^k +kP^{k-1} + \frac{k(k-1)}{2}P^{k-2} + \cdots }{P^k} \approx \frac{k}{P} = kp.$$

Value of $p$

Also assuming memorylessness and independence and all that good stuff, the numbers in the paper that Juho provided give a failure rate (memory, disk and cpu) of about $1/180$ in 5 days, which is 432000 seconds, so the probability of failure in 1 second is something like $p = 1/77543800 \approx 2^{-26}$, but you want a lower probability than that, so you should probably use $2^{-27}.$

Thus you want k such that

$$ 2^{27-k} \lt kt$$
$$ 27-k \lt \log_2 k + \log_2 t $$
$$ 27 - \log_2 k - \log_2 t \lt k.$$

But for $k > 1$, $\log_2 k > 0$ so choose

$$ k > 27 - \log_2 t. $$

Context

StackExchange Computer Science Q#29091, answer score: 8

Revisions (0)

No revisions yet.