
Array access is O(1) implies a fixed index size, which implies O(1) array traversal?

Submitted by: @import:stackexchange-cs

Problem

Arrays are generally presented as data structures with $\Theta(N)$ traversal and $\Theta(1)$ random element access. However, this seems inconsistent:

- If array access is really $\Theta(1)$, this means that the size of an element index is bounded by a constant (e.g., int64), since it can be processed in $\Theta(1)$ time. However, this implies that the array size is bounded by a constant (e.g., $2^{64}$ elements), which makes traversal $\Theta(1)$.

- If traversal is $\Theta(N)$, then the size of the index has an information-theoretic lower bound of $\Theta(\log N)$. Accessing an arbitrary $k$-th element requires processing the entire integer $k$, which has $\Omega(\log N)$ worst-case time complexity (as $k$ can be arbitrarily large).

In which model of computation can both "standard complexities" defined for arrays, $\Theta(N)$ traversal and $\Theta(1)$ access, hold simultaneously without the model being inconsistent?
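To make the tension explicit, the two bullets can be summarized in one line (writing $w$ for the number of bits in an index, a symbol introduced here only for this restatement):

$$\underbrace{w \le c}_{\text{fixed index size}} \;\Longrightarrow\; N \le 2^{c} = O(1), \qquad \underbrace{N \text{ unbounded}}_{\Theta(N)\text{ traversal}} \;\Longrightarrow\; w \ge \lceil \log_2 N \rceil = \Omega(\log N).$$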

Solution

It's a good question. From a pragmatic perspective, we tend not to worry about it.

From a theoretical perspective, see the transdichotomous model. In particular, a standard assumption is that there is some integer $M$ that is large enough (i.e., larger than the input size; larger than the maximum size of memory needed), and then we assume that each word of memory can store $\lg M$ bits, and we assume that each memory access takes $O(1)$ time.

Notice how this resolves your paradox. In particular, we'll have $N \le M$, so there is enough space in memory to store the entire array, and any index into it fits in a single word of $\lg M$ bits. Also, each array access really does take $O(1)$ time. The key is that $M$ grows with the size of the array, and is not a single fixed constant.
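In symbols (a restatement of the paragraph above, with $w$ denoting the word size): since $N \le M$ and each word holds $w = \lg M$ bits,

$$\lceil \log_2 N \rceil \;\le\; \lg M \;=\; w,$$

so any index fits in one word. Reading that word and following it costs $O(1)$ word operations, while visiting all $N$ elements costs $\Theta(N)$ word operations, and both statements hold for every $N$ because $M$ (and hence $w$) scales with the input.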

You can think of the transdichotomous model as assuming that we'll build a machine that is large enough to handle the data we're processing. Of course, the larger the data you have, the more bits you need to have in the address, so we need a wider bus to memory, so the cost grows -- but we just "ignore" this or assume it grows as needed.
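As a toy illustration of that scaling (not from the original answer; the unit-cost accounting and function names below are illustrative assumptions), here is a minimal Python sketch that counts word operations while letting the word width grow with $N$:

import math

def word_bits(n_elements):
    # Transdichotomous assumption (illustrative): the machine word is just
    # wide enough that any index into an n_elements-long array fits in it.
    return max(1, math.ceil(math.log2(n_elements)))

def access_cost_words(n_elements):
    # Random access: read one index word, then one element word -- O(1) word ops.
    return 2

def traversal_cost_words(n_elements):
    # Traversal: read every element once -- Theta(N) word ops.
    return n_elements

for n in (2**10, 2**20, 2**40):
    print(f"N = 2**{n.bit_length() - 1}: word = {word_bits(n)} bits, "
          f"access = {access_cost_words(n)} word ops, "
          f"traversal = {traversal_cost_words(n)} word ops")

The point is only that random access stays at a constant number of word operations while the word itself widens as $\lceil \log_2 N \rceil$; that widening is the "cost" discussed next.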

There is a sense in which this model is cheating a little bit. While the time complexity of a memory access is $O(1)$ (fixed, independent of $N$ or $M$), the dollar cost of the computer does grow with $N$ or $M$: to handle a larger value of $M$, we need a larger data bus to memory, so the computer costs more, and the cost of such a computer goes up as $\Theta(M \log M)$, not as $\Theta(M)$. In effect, the transdichotomous model is assuming we use increasing amounts of parallelism as the size of our data grows. You could argue that it is realistic, or that it is "sweeping under the rug" the cost of increasing the size of data buses, etc. I think both viewpoints have some validity.
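The $\Theta(M \log M)$ figure comes from counting hardware rather than time: the machine has $M$ words of $\lg M$ bits each, so

$$\text{total memory bits} \;=\; M \cdot \lg M \;=\; \Theta(M \log M),$$

and, similarly, the address bus and decoding logic must handle $\lg M$-bit addresses.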

This is analogous to how we think about Turing machines as a reasonable model for everyday computers. Of course, any one computer has a fixed amount of memory, so in principle we could say that it is a finite-state machine (with a gigantic but fixed number of possible states), and therefore it can only accept a regular language, not all decidable languages. But that isn't a very useful viewpoint. A way out is to assume that when our computer runs out of storage, we go out and buy more hard drives, as many as are needed to store the data we are working with, so its amount of memory is unlimited, and thus a Turing machine is a good model. The pragmatic answer is that Turing machines are closer to everyday computers than finite automata are.

Context

StackExchange Computer Science Q#152613, answer score: 31
