HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Determining the particular number in $O(n)$ time and space (worst case)

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
casedeterminingthenumberspacetimeandparticularworst

Problem

$\newcommand\ldotd{\mathinner{..}}$Given that $A[1\ldotd n]$ are integers such that $0\le A[k]\le m$ for all $1\le k\le n$, and the occurrence of each number except a particular number in $A[1\ldotd n]$ is an odd number. Try to find the number whose occurrence is an even number.

There is an $\Theta(n\log n)$ algorithm: we sort $A[1\ldotd n]$ into $B[1\ldotd n]$, and break $B[1\ldotd n]$ into many pieces, whose elements' value are the same, therefore we can count the occurrence of each element.

I want to find a worst-case-$O(n)$-time-and-$O(n)$-space algorithm.

Supposing that $m=\Omega(n^{1+\epsilon})$ and $\epsilon>0$, therefore radix sort is not acceptable.
$\DeclareMathOperator{\xor}{xor}$
Binary bitwise operations are acceptable, for example, $A[1]\xor A[2]$.

Solution

Here is an idea for a simple algorithm; just count all occurrences!

  • Find $m = \max A$. -- time $\Theta(n)$



  • "Allocate" array $C[0..m]$. -- time $O(1)$¹



  • Iterate over $A$ and increase $C[x]$ by one whenever you find $A[\_]=x$. If $C[x]$ was $0$, add $x$ to a linear list $L$. -- time $\Theta(n)$



  • Iterate over $L$ and find the element $x_e$ with $C[x_e]$ even. -- time $O(n)$.



  • Return $x_e$.



All in all, this gives you a linear-time algorithm which may use (in the sense of allocating) lots of memory. Note that being able to random-access $C$ in constant time independently of $m$ is crucial here.

An additional $O(n)$ bound on space is more difficult with this approach; I don't know of any dictionary data-structure that offers $O(1)$ time lookup. You can use hash-tables for which here are implementations with $O(1 + k/n)$ expected lookup time ($n$ the table's size, $k$ the number of stored elements) so you can get arbitrarily good with linear space -- in expectation. If all values in $A$ map to the same hash value, you are screwed.

  • On a RAM, this is implicitly done; all we need is the start position and maybe the end position.

Context

StackExchange Computer Science Q#2863, answer score: 2

Revisions (0)

No revisions yet.