HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Plain-language example of how a functional style makes parallel programming easier

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
exampleprogrammingmakeslanguagestyleparallelhowfunctionalplaineasier

Problem

I read a few "function >> imperative/OOP" articles because I heard there was a move in imperative OOP languages toward a functional style of coding, especially encouraging pure functions when possible. One recurring argument is that by not mutating state, you don't have to worry about race conditions and locking.

I do get the logic behind that: no two processes would mutate the same data, so locking and race conditions are irrelevant. Problem is I hadn't done any parallel programming or know any functional languages, so it's hard for me to really understand. I got as far as reading about persistent data structures as replacements for large mutable structures, but I hit a wall. I'm looking for an example, in plain language, of a parallel algorithm in an imperative style and a functional style that illustrates this recurring argument.

If it helps to provide a programming problem, let's say I have an array of integers (A). People submit commands (A[i] += 1) in real time to increment an element of that array by 1 (at a valid index i). Different elements are intended to be incremented in parallel. I can imagine the imperative solution does this but it locks an index during an increment. What would a functional solution look like? I will accept answers as simple as naming a functional data structure, if I could understand it by looking it up.

Solution

Functional programming is great for parallel programming that is not concurrent. The example problem you gave is inherently concurrent, but we can make it non-concurrent by "batching" the work that needs to be performed.

The batch form of the problem you suggested is like this. Input: length of desired array, and collection of indices to increment. Output: array where at each index we've summed up the number of times that index appears in the input. For example, for an array of size 4 and indices [1,0,1,1,0,0,0,3,0,1], the output should be [5,4,0,1] because 0 appears 5 times, 1 appears 4 times, etc.

An imperative solution is exactly like you suggest (I'm using Python code just for pseudocode):

# n is size of resulting array
def ImperativeCountIndices(n, indices):
  A = [0 for i in range(n)]
  for i in indices:   # do this step in parallel
    A[i] += 1         # this has to be atomic!!
  return A


In contrast, a functional approach is as follows. The high level idea is to work with dictionaries mapping indices to values. Then we sum the values of each index by merging dictionaries.

from functools import reduce

def mergeDicts(d1, d2):
  allkeys = set(d1.keys()).union(d2.keys())
  return {k: d1.get(k, 0) + d2.get(k, 0) for k in allkeys}

def FunctionalCountIndices(n, indices):
  d = reduce(mergeDicts, map(lambda i: {i: 1}, indices))
  return [d.get(i, 0) for i in range(n)]


The mergeDicts function might look expensive, but actually if you use binary search trees you can make this function extremely efficient and highly parallel (see e.g. https://arxiv.org/abs/1602.02120). The reduce and map functions are great examples of a fundamental "functional programming primitives" which are highly parallel.

The great thing about the functional approach is that we don't have to worry about potential data races. The behavior of the functional code is exactly the same every time you execute it.

Code Snippets

# n is size of resulting array
def ImperativeCountIndices(n, indices):
  A = [0 for i in range(n)]
  for i in indices:   # do this step in parallel
    A[i] += 1         # this has to be atomic!!
  return A
from functools import reduce

def mergeDicts(d1, d2):
  allkeys = set(d1.keys()).union(d2.keys())
  return {k: d1.get(k, 0) + d2.get(k, 0) for k in allkeys}

def FunctionalCountIndices(n, indices):
  d = reduce(mergeDicts, map(lambda i: {i: 1}, indices))
  return [d.get(i, 0) for i in range(n)]

Context

StackExchange Computer Science Q#136576, answer score: 3

Revisions (0)

No revisions yet.