HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How to approach analysis of randomized algorithm

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
analysisrandomizedalgorithmhowapproach

Problem

Let us suppose we have a sequence of values $C(i)$ that represent some counter for a given $i$ for $i \in \lbrace 1, \cdots, n \rbrace$. Let us assume some uniform distribution $U$ where selecting any integer between $1$ and $n$ has equal probability. A simple process using this information is to loop for some fixed number of iterations $M$ that, for ease, is a multiple of $n$, and do the following:

  • Randomly obtain some constant $k \leq n$ number of integers from $U$ and place them into a set $V$



  • For $i^ = \arg \min V$, do update $C(i^) \leftarrow C(i^*) + 1$



Ultimately, I want to be able to investigate the quantity $P\left( | C(i) - \frac{1}{n}\sum_{j=1}^n C(j) | \leq \epsilon \right)$ for any $i$. The question I have is if this simple algorithm allows, in probability, for the values of $C(i)$ to remain close to their average values regardless of how many iterations we do. This probability is obviously of a form where one could take advantage of say Hoeffding's inequality but I am not quite sure how to make use of the algorithm structure to get to that point.

I am not very well versed in randomized algorithms and so any insight into how to approach the analysis of such a simple algorithm would be insightful.

Solution

Let $P$ be the distribution of $i^*$, which you can calculate explicitly:
$$
\begin{align*}
p_i := \Pr[P=i] &= \left(1-\frac{i-1}{n}\right)^k - \left(1-\frac{i}{n}\right)^k.
\end{align*}
$$
This is the probability that all samples are at least $i$ minus the probability that they are all at least $i+1$.

The counter $C(i)$ has binomial distribution $\mathrm{Bin}(M,p_i)$, which will be concentrated around $Mp_i$. In contrast, $\frac{1}{n} \sum_{j=1}^n C(j) = \frac{M}{n}$. Therefore you can only expect $C(i)$ to be close to $\frac{1}{n} \sum_{j=1}^n C(j)$ if $p_i \approx \frac{1}{n}$. Even when $p_i \approx \frac{1}{n}$, the concentration of $C(i)$ is not so good – the standard deviation is $\sqrt{Mp_i(1-p_i)} \approx \sqrt{M/n}$, so the probability that $C(i)$ is very close to $M/n$ is probably quite small.

Context

StackExchange Computer Science Q#109420, answer score: 3

Revisions (0)

No revisions yet.