patternModerate
Determine missing number in data stream
Viewed 0 times
streamnumberdeterminemissingdata
Problem
We receive a stream of $n-1$ pairwise different numbers from the set $\left\{1,\dots,n\right\}$.
How can I determine the missing number with an algorithm that reads the stream once and uses a memory of only $O(\log_2 n)$ bits?
How can I determine the missing number with an algorithm that reads the stream once and uses a memory of only $O(\log_2 n)$ bits?
Solution
From the comment above:
Before processing the stream, allocate $\lceil \log_2 n \rceil$ bits, in which you write $x:= \bigoplus_{i=1}^n \mathrm{bin}(i)$ ($\mathrm{bin}(i)$ is the binary representation of $i$ and $\oplus$ is pointwise exclusive-or). Naively, this takes $\mathcal{O}(n)$ time.
Upon processing the stream, whenever one reads a number $j$, compute $x := x \oplus \mathrm{bin}(j)$. Let $k$ be the single number from $\{ 1, ... n\}$ that is not included in the stream. After having read the whole stream, we have
$$ x = \left(\bigoplus_{i=1}^n \mathrm{bin}(i)\right) \oplus \left(\bigoplus_{i \neq k } \mathrm{bin}(i)\right)
= \mathrm{bin}(k) \oplus \bigoplus_{i \neq k } (\mathrm{bin}(i) \oplus \mathrm{bin}(i)) = \mathrm{bin}(k), $$
yielding the desired result.
Hence, we used $\mathcal{O}(\log n)$ space, and have an overall runtime of $\mathcal{O}(n)$.
Before processing the stream, allocate $\lceil \log_2 n \rceil$ bits, in which you write $x:= \bigoplus_{i=1}^n \mathrm{bin}(i)$ ($\mathrm{bin}(i)$ is the binary representation of $i$ and $\oplus$ is pointwise exclusive-or). Naively, this takes $\mathcal{O}(n)$ time.
Upon processing the stream, whenever one reads a number $j$, compute $x := x \oplus \mathrm{bin}(j)$. Let $k$ be the single number from $\{ 1, ... n\}$ that is not included in the stream. After having read the whole stream, we have
$$ x = \left(\bigoplus_{i=1}^n \mathrm{bin}(i)\right) \oplus \left(\bigoplus_{i \neq k } \mathrm{bin}(i)\right)
= \mathrm{bin}(k) \oplus \bigoplus_{i \neq k } (\mathrm{bin}(i) \oplus \mathrm{bin}(i)) = \mathrm{bin}(k), $$
yielding the desired result.
Hence, we used $\mathcal{O}(\log n)$ space, and have an overall runtime of $\mathcal{O}(n)$.
Context
StackExchange Computer Science Q#2079, answer score: 16
Revisions (0)
No revisions yet.