HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Is there a generic way to parallelize algorithms?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
genericparallelizewayalgorithmsthere

Problem

Take:

  • An algorithm $A$ that is analytically nice in some way, say, for some class of inputs of length $n$ it runs at $O(f(n))$ on some Turing machine.



  • $k+2$ independent, identical such machines.



Is there a generic method to produce algorithms $A_{in}, A_{out}, A_{1},\dots, A_k$ for which tapes can be made so that:

  • $A_{in}$ finishes after $O(k)$ and transforms a length-$n$ input for $A$ into $k$ inputs, each to one of $A_\ell$, $\ell \in \{1,\dots, k\}.$



  • Each $A_\ell$ runs in $O(f(n)/k)$



  • $A_{out}$ runs in $O(k)$ and combines the the outputs of the $A_\ell$ into what $A$ would have produced on the input of $A_{in}$

Solution

No, there cannot be such a generic method to produce algorithms $A_{in}, A_{out}, A_{1},\dots, A_k$ with the requested properties. The basic problem is that the requested parallel algorithm only uses a single round. (And allowing $A_{in}$ and $A_{out}$ only $O(k)$ time is very restrictive, they cannot even read length-$n$ input or write a length-$n$ output in that time.)

One simple problem which even has a parallel algorithm (but not of the form stated in the question) is the all-prefix-sums operation. Input is a list of numbers, and output is another list, whose elements are the sum of all previous elements in the original list. See section "3.2 Scans" on page 23 in

Parallel Algorithms (64 pages)

https://www.cs.cmu.edu/~guyb/papers/BM04.pdf

by Guy E. Blelloch and Bruce M. Maggs – 1996

There are at least two possible criticisms of this answer. (1) The counter-example problem took a list of size $n$ and output a list of size $n$. Maybe the question expected the output to be much shorter than the input, like in a decision problem. (2) No explicit proof is given that the counter-example problem has no parallel algorithm of the form stated in the question.

Regarding (1), an output of length-$n$ is used to get around Barrington's theorem, as explained in a comment below the question. Regarding (2), one might look at the elements from $n/2$ to $n$ of the output. Each of them would require more than $n/2$ additions, if computed in isolation. $A_{out}$ can only output $O(k)$ of them in $O(k)$ time, so most of them must be computed in partial isolation. Hence, the algorithms $A_i$ outputting them will need more than $O(n/2)$ time.

Context

StackExchange Computer Science Q#85952, answer score: 3

Revisions (0)

No revisions yet.