Plain-language example of how a functional style makes parallel programming easier
Problem
I read a few "functional > imperative/OOP" articles because I heard there is a move in imperative OOP languages toward a functional style of coding, especially encouraging pure functions where possible. One recurring argument is that by not mutating state, you don't have to worry about race conditions and locking.
I do get the logic behind that: no two processes would mutate the same data, so locking and race conditions are irrelevant. The problem is that I haven't done any parallel programming and don't know any functional languages, so it's hard for me to really understand. I got as far as reading about persistent data structures as replacements for large mutable structures, but then I hit a wall. I'm looking for a plain-language example of a parallel algorithm, written in both an imperative style and a functional style, that illustrates this recurring argument.
If it helps to have a concrete programming problem, let's say I have an array of integers A. People submit commands (A[i] += 1) in real time to increment an element of that array by 1 (at a valid index i). Different elements are intended to be incremented in parallel. I imagine the imperative solution does exactly this, but locks an index during an increment. What would a functional solution look like? I will accept an answer as simple as the name of a functional data structure, as long as I can understand it by looking it up.
Solution
Functional programming is great for parallel programming that is not concurrent. The example problem you gave is inherently concurrent, but we can make it non-concurrent by "batching" the work that needs to be performed.
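To make "batching" concrete, here is a minimal sketch (my own illustration, not part of the original answer): instead of applying each A[i] += 1 command as it arrives, which is where locking would come in, we buffer the submitted indices and later hand the whole batch to a pure function.

```python
# Hypothetical sketch: buffer real-time "A[i] += 1" commands instead of
# applying them immediately, so the work becomes a non-concurrent batch.
pending = []           # indices submitted so far

def submit(i):
    pending.append(i)  # record the command; the array is never touched here

for i in [1, 0, 1, 3, 0]:   # commands arriving "in real time"
    submit(i)

batch, pending = pending, []  # drain: the batch is now plain data to count
print(batch)                  # [1, 0, 1, 3, 0]
```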
The batch form of the problem you suggested is like this. Input: the length of the desired array and a collection of indices to increment. Output: an array where each entry is the number of times that index appears in the input. For example, for an array of size 4 and indices [1,0,1,1,0,0,0,3,0,1], the output should be [5,4,0,1], because 0 appears 5 times, 1 appears 4 times, and so on.

An imperative solution is exactly as you suggest (I'm using Python just as pseudocode):
# n is the size of the resulting array
def ImperativeCountIndices(n, indices):
    A = [0 for i in range(n)]
    for i in indices:  # do this step in parallel
        A[i] += 1      # this has to be atomic!!
    return A

In contrast, a functional approach is as follows. The high-level idea is to work with dictionaries mapping indices to values. Then we sum the values at each index by merging dictionaries.
from functools import reduce

def mergeDicts(d1, d2):
    # Pure merge: builds a new dictionary, mutates neither input.
    allkeys = set(d1.keys()).union(d2.keys())
    return {k: d1.get(k, 0) + d2.get(k, 0) for k in allkeys}

def FunctionalCountIndices(n, indices):
    # The initial value {} makes this safe for an empty batch of indices.
    d = reduce(mergeDicts, map(lambda i: {i: 1}, indices), {})
    return [d.get(i, 0) for i in range(n)]

The mergeDicts function might look expensive, but if you use binary search trees it can be made extremely efficient and highly parallel (see e.g. https://arxiv.org/abs/1602.02120). The reduce and map functions are great examples of fundamental "functional programming primitives" that are themselves highly parallel.

The great thing about the functional approach is that we don't have to worry about potential data races: the behavior of the functional code is exactly the same every time you execute it.
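To make that parallelism concrete, here is a rough sketch of a chunked version (my own, using Python's standard concurrent.futures and collections.Counter rather than the tree-based structures from the paper linked above): each chunk of indices is counted independently, and the per-chunk dictionaries are merged with a pure function, so no worker ever mutates shared state.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def merge_counts(d1, d2):
    # Pure merge: builds a new dictionary, mutates neither input.
    return {k: d1.get(k, 0) + d2.get(k, 0) for k in set(d1) | set(d2)}

def parallel_count_indices(n, indices, workers=4):
    # Split the batch into independent chunks (round-robin by stride).
    chunks = [indices[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Count each chunk independently; Counter builds a fresh dict per chunk.
        partials = list(pool.map(Counter, chunks))
    d = reduce(merge_counts, partials, {})
    return [d.get(i, 0) for i in range(n)]

print(parallel_count_indices(4, [1, 0, 1, 1, 0, 0, 0, 3, 0, 1]))  # [5, 4, 0, 1]
```

Note this sketch only shows the structure: CPython threads won't speed up CPU-bound counting because of the GIL, so a process pool or the parallel trees cited above would be needed for real speedups; either way, the merge step needs no locks.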
Context
StackExchange Computer Science Q#136576, answer score: 3