snippetMajor
How to find 5 repeated values in O(n) time?
Viewed 0 times
howrepeatedtimefindvalues
Problem
Suppose you have an array of size $n \geq 6$ containing integers from $1$ to $n − 5$, inclusive, with exactly five repeated. I need to propose an algorithm that can find the repeated numbers in $O(n)$ time. I cannot, for the life of me, think of anything. I think sorting, at best, would be $O(n\log n)$? Then traversing the array would be $O(n)$, resulting in $O(n^2\log n)$. However, I'm not really sure if sorting would be necessary as I've seen some tricky stuff with linked list, queues, stacks, etc.
Solution
The solution in fade2black's answer is the standard one, but it uses $O(n)$ space. You can improve this to $O(1)$ space as follows:
This algorithm assumes the RAM machine model, in which basic arithmetic operations on $O(\log n)$-bit words take $O(1)$ time.
Another way to formulate this solution is along the following lines:
This solution shows that if we replace 5 by $d$, then we get (I believe) a $O(d^2n)$ algorithm using $O(d^2)$ space, which performs $O(dn)$ arithmetic operations on integers of bit-length $O(d\log n)$, keeping at most $O(d)$ of these at any given time. (This requires careful analysis of the multiplications we perform, most of which involve one operand of length only $O(\log n)$.) It is conceivable that this can be improved to $O(dn)$ time and $O(d)$ space using modular arithmetic.
- Let the array be $A[1],\ldots,A[n]$. For $d=1,\ldots,5$, compute $\sigma_d = \sum_{i=1}^n A[i]^d$.
- Compute $\tau_d = \sigma_d - \sum_{i=1}^{n-5} i^d$ (you can use the well-known formulas to compute the latter sum in $O(1)$). Note that $\tau_d = m_1^d + \cdots + m_5^d$, where $m_1,\ldots,m_5$ are the repeated numbers.
- Compute the polynomial $P(t) = (t-m_1)\cdots(t-m_5)$. The coefficients of this polynomial are symmetric functions of $m_1,\ldots,m_5$ which can be computed from $\tau_1,\ldots,\tau_5$ in $O(1)$.
- Find all roots of the polynomial $P(t)$ by trying all $n-5$ possibilities.
This algorithm assumes the RAM machine model, in which basic arithmetic operations on $O(\log n)$-bit words take $O(1)$ time.
Another way to formulate this solution is along the following lines:
- Calculate $x_1 = \sum_{i=1}^n A[i]$, and deduce $y_1 = m_1 + \cdots + m_5$ using the formula $y_1 = x_1 - \sum_{i=1}^{n-5} i$.
- Calculate $x_2 = \sum_{1 \leq i
- Deduce $y_2 = \sum_{1 \leq i
- Calculate $x_3,x_4,x_5$ and deduce $y_3,y_4,y_5$ along similar lines.
- The values of $y_1,\ldots,y_5$ are (up to sign) the coefficients of the polynomial $P(t)$ from the preceding solution.
This solution shows that if we replace 5 by $d$, then we get (I believe) a $O(d^2n)$ algorithm using $O(d^2)$ space, which performs $O(dn)$ arithmetic operations on integers of bit-length $O(d\log n)$, keeping at most $O(d)$ of these at any given time. (This requires careful analysis of the multiplications we perform, most of which involve one operand of length only $O(\log n)$.) It is conceivable that this can be improved to $O(dn)$ time and $O(d)$ space using modular arithmetic.
Context
StackExchange Computer Science Q#82180, answer score: 25
Revisions (0)
No revisions yet.