Exponentially different running times: random-access machine (RAM) vs Turing machine
Problem
I propose a simple polynomial-time algorithm for the following problem: given $m$ integers, each stored on $n$ bits, output the integer that appears most often. (A code sketch of the procedure follows the list below.)
- Reserve a memory area with $2^n$ slots/registers/locations/cells, each cell being long enough to store the number $m$. This requires constant time because we do not initialize/touch the cells. If you agree with this, skip to step 2. Otherwise, check the following four points.
- One could easily be tempted to read "reserve a memory area" as "allocate a memory area". Actually, the memory-allocation operation is an invention/complication of real multi-tasking machines, where programs can "fight" for memory. If I consider a random-access machine (RAM) that only ever runs my algorithm, I can state that the algorithm may simply use the whole memory. Thus, it is enough to reserve (read "put aside", not "allocate") the first $2^n$ cells for an array with $2^n$ elements. The next memory cells are simply used for the other variables.
- To be more precise, I consider running my algorithm on a RAM-TM (Random Access Memory Turing Machine), that is, a multi-tape TM with a memory tape and an index tape. Given a number written on the index tape, the RAM-TM takes constant time to move the head of the memory tape to the location indicated on the index tape. We also allow the RAM-TM to have a few other tapes for easily doing arithmetic. The RAM-TM has no notion of memory allocation. This seems to be a simplification of the various (transdichotomous or word) RAM machines out there.
- Even if I must confess I do not really know the exact technical details of all the theoretical RAM (or RASP) machines, it is possible to work with an exponentially large memory area and yet use only a polynomially large number of cells, and my algorithm is not the first to do this. The same thing happens in binary search over a sorted array: $n$ cells, but only $O(\log n)$ time complexity. This scientific paper introduced a data s
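Here is a minimal Python sketch of the proposed procedure, assuming the remaining steps (not shown in this excerpt) are the obvious ones: count each value in its cell, then return the value with the largest count. Note that in Python the allocation on the first line actually touches every cell, which is exactly the cost discussed in the answer below; the sketch only illustrates the intended bookkeeping.

```python
def most_frequent_direct(values, n):
    """Count occurrences with a directly addressed table of 2**n cells.

    values: the m input integers, each assumed to lie in range(2**n).
    Caveat: building `counts` touches all 2**n cells, so in Python this
    single line already costs Theta(2**n) time and space.
    """
    counts = [0] * (2 ** n)              # one counter per possible n-bit value
    for v in values:
        counts[v] += 1                   # O(1) per input element
    return max(range(2 ** n), key=counts.__getitem__)   # most frequent value


# Example: among the 2-bit values below, 3 appears most often.
print(most_frequent_direct([3, 1, 3, 2, 3], n=2))        # prints 3
```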
Solution
Your algorithm involves an array of length $2^n$. Allocating an array of length $2^n$ takes $\Theta(2^n)$ time. Consequently, the running time will be $O(m+2^n)$.
Alternatively, you can represent the array in a sparse way using a self-balancing binary tree data structure. Then each access to the array takes $O(\log m)$ time, so the total running time of your algorithm is $O(m \log m)$ (assuming we can treat $n$ as a constant).
Alternatively, you could use a hash table to replace the array. Then the expected time per access is $O(1)$ if you use a suitable hash function, so your algorithm will have expected running time $O(m)$ (again, treating $n$ as a small constant). However, this is the expected running time, not the worst-case running time; the worst case could be as bad as $O(m^2)$.
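As a concrete illustration of the hash-table variant (the balanced-tree variant is structurally identical; only the underlying map changes), here is a short Python sketch. Python's built-in dict is a hash table, so each update runs in expected constant time:

```python
from collections import Counter

def most_frequent_hashed(values):
    """Expected O(m) time: one hash-table update per element, then one scan
    over at most m distinct keys to find the most frequent value."""
    counts = Counter(values)               # dict-backed (hash table) counting
    return max(counts, key=counts.get)     # key with the largest count


print(most_frequent_hashed([5, 7, 5, 9, 5, 7]))   # prints 5
```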
You seem to have the idea that you can allocate and work with an array of exponential size in constant time. I don't think that's correct. What is true is that you can simulate such an array, at the cost of a logarithmic increase in the running time (using the methods I described above). If you care only about whether the algorithm is polynomial-time or not, you can ignore the logarithmic increase, but if you care about the concrete running time, you can't ignore it.
And if you really care about the precise running time, you should probably specify the model of computation (e.g., transdichotomous model, etc.). For instance, in the transdichotomous model, if $n$ is guaranteed to be at most the word size (i.e., all of your integers fit within a word), then yes, your algorithm works and runs in linear time -- you can allocate/reserve memory for $2^n$ cells without difficulties. However, if $n$ is larger than the word size, you can't. So, giving a precise answer to your question requires specifying a particular model of computation.
Related: Saving on array initialization.
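For completeness, here is a sketch of the folklore trick usually meant by that link: an array that behaves as if it were fully initialized while only ever writing to the cells that are actually used. The class and names below are illustrative, not taken from the linked question; in Python the backing lists must still be allocated up front, so the constant-time initialization only holds in a machine model where reserving uninitialized memory is free.

```python
class LazyArray:
    """Array with 'free' initialization: a cell of `data` counts as written
    only if the stack vouches for it.  In a model where uninitialized memory
    can be reserved in O(1), every operation here is O(1); in Python the two
    allocations in __init__ still cost O(size), so this only shows the idea.
    """

    def __init__(self, size, default=0):
        self.default = default
        self.data = [None] * size      # conceptually uninitialized cell values
        self.where = [0] * size        # conceptually uninitialized back-pointers
        self.stack = []                # indices that have genuinely been written

    def _written(self, i):
        w = self.where[i]
        return w < len(self.stack) and self.stack[w] == i

    def __getitem__(self, i):
        return self.data[i] if self._written(i) else self.default

    def __setitem__(self, i, value):
        if not self._written(i):       # first write: register i on the stack
            self.where[i] = len(self.stack)
            self.stack.append(i)
        self.data[i] = value


# With such an array, the question's counting table of 2**n cells only pays
# for the (at most m) cells that are actually incremented.
counts = LazyArray(2 ** 8)
for v in [3, 1, 3]:
    counts[v] = counts[v] + 1
print(counts[3], counts[1], counts[0])   # prints 2 1 0
```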
Context
StackExchange Computer Science Q#78145, answer score: 4
Revisions (0)
No revisions yet.