patternMinor
Probability of probing $t$ locations in a Cuckoo hash is $O(\frac{1}{2^{t/2}})$ locations in the worst case
Viewed 0 times
caseprobinglocationsfracthecuckoohashprobabilityworst
Problem
I was told this question may be better received here.
Prove that the probability that an insertion into a cuckoo hash table
probes $t$ array locations is $O(\frac{1}{2^{t/2}})$. Keep in mind that there are two tables, each with size $s \ge 2n$, where $n$ is the number of elements in the set.
I'm trying to use induction, but I don't know if this is the best method to go about proving this.
The worst case to probe $1$ array location would be when all $n$ element are stored in the first table, and we get probability $\frac{n}{2n} = \frac{1}{2} = O(\frac{1}{\sqrt{2}})$ for insertion.
The worst case to probe $2$ array locations would be when all $n$ elements are stored in the first table, we hit the first table, and we succeed in the second table. This has the same probability as it does to probe $1$ array location.
However, I don't know how to continue the analysis. For example, what's the worst case probability to probe $4$ array locations? The question statement implies that the worst case is $O(\frac{1}4)$, but how do we achieve this result? If the first and second tables each have $\frac{1}4$ of their array locations occupied, then the worst case to probe $4$ elements would be hitting in the first table, hitting in the second table, hitting in the first table again, then finally inserting into the second table $\Rightarrow \frac{1}4\cdot\frac{1}4\cdot\frac{1}{4}\cdot\frac{3}4 = \frac{3}{256} \not = \frac{1}4$.
Does anyone have a clearer way of thinking about this problem?
Prove that the probability that an insertion into a cuckoo hash table
probes $t$ array locations is $O(\frac{1}{2^{t/2}})$. Keep in mind that there are two tables, each with size $s \ge 2n$, where $n$ is the number of elements in the set.
I'm trying to use induction, but I don't know if this is the best method to go about proving this.
The worst case to probe $1$ array location would be when all $n$ element are stored in the first table, and we get probability $\frac{n}{2n} = \frac{1}{2} = O(\frac{1}{\sqrt{2}})$ for insertion.
The worst case to probe $2$ array locations would be when all $n$ elements are stored in the first table, we hit the first table, and we succeed in the second table. This has the same probability as it does to probe $1$ array location.
However, I don't know how to continue the analysis. For example, what's the worst case probability to probe $4$ array locations? The question statement implies that the worst case is $O(\frac{1}4)$, but how do we achieve this result? If the first and second tables each have $\frac{1}4$ of their array locations occupied, then the worst case to probe $4$ elements would be hitting in the first table, hitting in the second table, hitting in the first table again, then finally inserting into the second table $\Rightarrow \frac{1}4\cdot\frac{1}4\cdot\frac{1}{4}\cdot\frac{3}4 = \frac{3}{256} \not = \frac{1}4$.
Does anyone have a clearer way of thinking about this problem?
Solution
Basically, you just write down everything that is necessary to have $t$ evictions and then use the universality of your hash functions to bound the probability.
Assume you want to insert $x$ and your hash functions are $f,g$. Then if you have $t$ probes if $f(x)$ is occupied that is
$$
\begin{align}
\mathbb{P}[t=1]& =\sum_{y \neq x} \mathbb{P}[f(x)=f(y)]\\
&\le n \cdot 1/s \le 1/2.
\end{align}$$
since your hash function are $(1,k)$-universal (I assume).
Now for $t=2$, the place of $f(x)$ have to be occupied by some element $y$ (that is $f(x)=f(y)$), and the alternative for $y$ is blocked by some other element $z$ (that is $g(y)=g(z)$). Summing up over all possible values for $y$ and $z$ gives
$$
\begin{align}
\mathbb{P}[t=2]& =\sum_{y,z \neq x} \mathbb{P}[f(x)=f(y) \text { and } g(y)=g(z) ]]\\
&\le n^2 \cdot 1/s^2 \le 1/4.
\end{align}$$
The summands of the sum are bounded by $1/s^2$ due to the universality of the hash functions, and you have less than $n^2$ summands.
I think from here you can continue by yourself.
Assume you want to insert $x$ and your hash functions are $f,g$. Then if you have $t$ probes if $f(x)$ is occupied that is
$$
\begin{align}
\mathbb{P}[t=1]& =\sum_{y \neq x} \mathbb{P}[f(x)=f(y)]\\
&\le n \cdot 1/s \le 1/2.
\end{align}$$
since your hash function are $(1,k)$-universal (I assume).
Now for $t=2$, the place of $f(x)$ have to be occupied by some element $y$ (that is $f(x)=f(y)$), and the alternative for $y$ is blocked by some other element $z$ (that is $g(y)=g(z)$). Summing up over all possible values for $y$ and $z$ gives
$$
\begin{align}
\mathbb{P}[t=2]& =\sum_{y,z \neq x} \mathbb{P}[f(x)=f(y) \text { and } g(y)=g(z) ]]\\
&\le n^2 \cdot 1/s^2 \le 1/4.
\end{align}$$
The summands of the sum are bounded by $1/s^2$ due to the universality of the hash functions, and you have less than $n^2$ summands.
I think from here you can continue by yourself.
Context
StackExchange Computer Science Q#33671, answer score: 2
Revisions (0)
No revisions yet.