HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Approximating π via Monte Carlo simulation

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
montesimulationviaapproximatingcarlo

Problem

Inspired by a tweet linked to me by a friend and a Haskell implementation by her for the same problem, I decided to try my hand at approximating the value of π using everything in the Haskell standard library I could find for the job. Here’s what I came up with:

module Pi where

import Data.List (genericLength)
import Control.Arrow (Arrow, ( [(a, a)]
chunk2 []      = []
chunk2 [_]     = error "list of uneven length"
chunk2 (x:y:r) = (x, y) : chunk2 r

both :: Arrow arr => arr a b -> arr (a, a) (b, b)
both f = f *** f

unsplit :: Arrow arr => (a -> b -> c) -> arr (a, b) c
unsplit = arr . uncurry

randomFloats :: IO [Float]
randomFloats = randoms  newStdGen

randomPoints :: IO [Point Float]
randomPoints = chunk2  randomFloats

isInUnitCircle :: (Floating a, Ord a) => Point a -> Bool
isInUnitCircle (x, y) = x' + y'  [b] -> [b] -> c
lengthRatio = curry (unsplit (/)  Float
approximatePi points = circleRatio * 4.0
  where circlePoints = filter isInUnitCircle points
        circleRatio = circlePoints `lengthRatio` points

main :: IO ()
main = do
  putStrLn "How many points do you want to generate to approximate π?"
  numPoints  getLine
  points  randomPoints
  print $ approximatePi points


I’m interested in a general review, but I’m especially curious about my use of arrows: is there a better way to write lengthRatio? Are anything like both and unsplit provided anywhere in the standard library? If not, do any packages help?

Solution

About those arrows


I’m interested in a general review, but I’m especially curious about my use of arrows: is there a better way to write lengthRatio?

Compare the following two lines. Both do the same, but which one would you rather see if you need to change your code drunk in three months, with only 5% battery left?

lengthRatio = curry (unsplit (/) <<< both genericLength)

lengthRatio xs ys = genericLength xs / genericLength ys


Also, which one has which type, and which one is more general?

Arrows are great if you want to abstract functions. But throughout your small script, you're still just working with (->), not any other instance of Arrow. For a small script like this, Arrow is too much. For example the pointwise definition above is actually a character shorter than the pointfree one. Sure, the pointfree one is clever, but it's also very beginner-unfriendly.

About randomness

randomPoints introduces a dependency between your point coordinates \$x\$ and \$y\$, since both draw from the same sequence. This usually leads to points on hyperplanes (see disadvantages of LCG and spectral test). Your friends variant doesn't have this immediate problem:

randomTuples :: Int -> IO [(Float, Float)]
randomTuples n = do
  seed1 <- newStdGen
  seed2 <- newStdGen
  let xs = randoms seed1 :: [Float] -- two different
      ys = randoms seed2 :: [Float] -- generators being used
  return $ take n $ zipWith (,) xs ys


However, since newStdGen is merely a split, it's more or less hiding the dependency at another place. Still, it's something to keep in mind, if you don't want to end up with something like this.

But how would you check this? Well, you would run tests, over and over. Here's the second design critique on randomPoints, it doesn't take a RandomGen. Truth be told, if I say that Arrow is too much for a small script, then

randomPoints :: RandomGen g => g -> [Point Float]


is too much either.

Also, if you know you're going to generate Points, a newtype Point a together with

instance Random a => Random (Point a) where


is feasible and doesn't introduce a potential error via chunk2. Keep possible problems with Random in mind, though.

About names

The function isInUnitCircle lies. It's not testing whether the point \$(x,y)\$ lies in the circle with radius \$r = 1\$ with center in the origin, e.g.
$$
\sqrt{x^2 + y^2} \le 1^2 \Leftrightarrow x^2 + y^2 \le 1
$$
but in the circle with diameter \$d = 2r = 1\$ with center in \$(0.5, 0.5)\$. In the following picture, the green region is where you generate your random values. In the left one, you see the regular unit circle, in the right one, you see the circle size you're actually testing (after shifting your values from the green square into the red one):

Therefore, you're not calculating the "usual" fourth of a circle, but instead a circle with a fourth of the original size (\$\pi(\frac{1}{2})^2 = \frac{\pi}{4}\$)). Luckily, it doesn't matter for the convergence.

A real test that checks whether a point is in the unit circle is tremendously easier:

isInUnitCircle :: (Num a, Ord a) => Point a -> Bool
isInUnitCircle (x, y) = x ^ 2 + y ^ 2 <= 1


About optimization

Last, but not least, there's an issue with approximatePi, or rather the use of lengthRatio on the same list twice. Actually, taking the length of the list again is a litle bit strange, since you know how large the sample is:

numPoints  getLine                -- sample size
points     randomPoints
print $ approximatePi points                 -- sample size still known (?)


But let's say that you don't actually know how many points you have. Let's assume that someone wants to check a many points. Suddenly, the memory usage of your program explodes:

$ echo 10000000 | ./CalcPi +RTS -s
How many points do you want to generate to approximate π?
3.141744
33,724,505,920 bytes allocated in the heap
5,288,250,096 bytes copied during GC
1,319,621,976 bytes maximum residency (17 sample(s))
5,554,344 bytes maximum slop
2587 MB total memory in use (0 MB lost due to fragmentation)

Tot time (elapsed) Avg pause Max pause
Gen 0 63626 colls, 0 par 1.606s 3.155s 0.0000s 0.0006s
Gen 1 17 colls, 0 par 2.732s 3.373s 0.1984s 1.2720s

INIT time 0.000s ( 0.000s elapsed)
MUT time 17.158s ( 15.542s elapsed)
GC time 4.337s ( 6.528s elapsed)
EXIT time 0.019s ( 0.166s elapsed)
Total time 21.514s ( 22.236s elapsed)

%GC time 20.2% (29.4% elapsed)

Alloc rate 1,965,530,983 bytes per MUT second

Productivity 79.8% of total user, 77.2% of total elapsed

Even though randoms generates a lazy list, approximatePi needs to hold onto it completely due to lengthRatio. A classic space leak. The altnerative version of lengthRatio won't save you from that. Instead, provide a

Code Snippets

lengthRatio = curry (unsplit (/) <<< both genericLength)

lengthRatio xs ys = genericLength xs / genericLength ys
randomTuples :: Int -> IO [(Float, Float)]
randomTuples n = do
  seed1 <- newStdGen
  seed2 <- newStdGen
  let xs = randoms seed1 :: [Float] -- two different
      ys = randoms seed2 :: [Float] -- generators being used
  return $ take n $ zipWith (,) xs ys
randomPoints :: RandomGen g => g -> [Point Float]
instance Random a => Random (Point a) where
isInUnitCircle :: (Num a, Ord a) => Point a -> Bool
isInUnitCircle (x, y) = x ^ 2 + y ^ 2 <= 1

Context

StackExchange Code Review Q#117441, answer score: 6

Revisions (0)

No revisions yet.