How do you compute the time complexity of distributed algorithms?
Problem
How do you compute the running time of distributed algorithms in message-passing systems? While reading up on the topic, I found it very strange that any computation done at each node is considered to take $\mathcal{O}(1)$ time because of the unreliability in the time it takes to pass messages. Since this approach seems completely impractical, I assume I have not understood it properly. Could someone please explain?
By impractical, I mean that I could trivially solve any NP-hard problem in distributed computing in $\mathcal{O}(n^2)$ time by passing information throughout the network and then brute-forcing the solution in $\mathcal{O}(1)$ time. This seems absurd, since in real life message passing shouldn't take more time than a brute-force search over the whole search space.
Solution
Time complexity is always measured relative to some model. For example, the $\Theta(n \log n)$ bound on sorting counts the number of comparisons performed. If comparisons are not constant-time, the total number of operations will be higher.
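Here is a minimal sketch (my own illustration, not part of the original answer) of what "measuring relative to a model" looks like in practice: a merge sort instrumented to tally comparisons, the cost metric behind the $\Theta(n \log n)$ bound. The function and variable names are hypothetical.

```python
def merge_sort(items, counter):
    """Sort items, tallying comparisons in counter (a one-element list)."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # one unit of cost per comparison in this model
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

comparisons = [0]
merge_sort(list(range(1024, 0, -1)), comparisons)
print(comparisons[0])  # on the order of n log n, far below n^2
```

If each comparison were itself expensive (say, comparing long strings), the comparison count would stay the same but the wall-clock cost would not, which is exactly the sense in which a bound is tied to its cost model.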
Because of the high variability in, and the sheer magnitude of, communication time, distributed algorithms are often measured in terms of the number of messages sent. So in your example, you only solve the problem in $O(n^2)$ time if you assume that sending messages costs nothing. If you're sending each possibility to a node, a good model will incorporate that cost, and you won't get magically fast solutions to NP-hard problems.
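As a rough sketch of that cost model (again my illustration, with hypothetical names), here is a coordinator that ships the whole problem instance to every worker; the network charges for each message and each byte, so the communication bill is paid before any "$\mathcal{O}(1)$" local brute force even starts:

```python
from dataclasses import dataclass


@dataclass
class Network:
    """Cost model: count messages and bytes, ignore local computation."""
    messages_sent: int = 0
    bytes_sent: int = 0

    def send(self, payload: bytes) -> None:
        # Every message is charged, no matter how fast local compute is.
        self.messages_sent += 1
        self.bytes_sent += len(payload)


def broadcast_input(net: Network, instance: bytes, num_workers: int) -> None:
    """Coordinator sends the full problem instance to every worker."""
    for _ in range(num_workers):
        net.send(instance)


net = Network()
instance = b"x" * 10_000                       # a 10 KB problem instance
broadcast_input(net, instance, num_workers=100)
print(net.messages_sent, net.bytes_sent)       # 100 messages, 1,000,000 bytes
```

Under this accounting, distributing candidate solutions to nodes is not free: the message count grows with the number of candidates you ship, so the exponential search space shows up in the communication cost instead of disappearing.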
Also, for parallel computing, you model (approximately) how many nodes you have. It's a safe bet to assume that you don't have more than $O(\log n)$ nodes for an input of size $n$. You certainly won't be able to scale your nodes exponentially, so brute-forcing doesn't always work.
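A quick back-of-the-envelope check (my example, assuming $\log_2 n$ workers and $2^n$ candidate solutions) shows that the per-node share of a brute-force search stays exponential:

```python
import math

# Even with log2(n) workers, each worker's share of 2**n candidates
# is still exponential in n.
for n in (20, 40, 60):
    workers = max(1, int(math.log2(n)))
    per_worker = 2**n / workers
    print(n, workers, f"{per_worker:.2e}")
```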
Often, the computation done in distributed algorithms is minimal, or the computational content is intensive but orthogonal to the distributed nature of the algorithm (i.e. the algorithm is written over some abstract, computationally intensive task). In these cases, you usually care only about the messages sent, or some other cost metric, since you might never know how many steps the computation takes.
Context
StackExchange Computer Science Q#74756, answer score: 5