patternMinor
Haskell Particle Simulation
Viewed 0 times
particlesimulationhaskell
Problem
I recently started learning Haskell and as my first project I decided to port a particle simulation I had written in C.
The simulation is pretty simple. We have a cubic box with particles (spheres) inside it. Then we choose to either move a particle or to change the volume of the box and check for any collisions between the particles. If there are any collisions we keep the old configuration.
To speed up collision detection I have also implemented a 'cell linked list' which is in principle a uniform grid with a minimum size of a cell equal to the particle diameter. In the C code I have implemented this using arrays while in the Haskell code I use a Vector of Lists.
You can find my full code here, but here are some of the most important snippets,
The main Simulation Loop, where type
Here I'm effectively randomly choosing to either displace a particle or change the volume and every 100 steps I print useful information.
These two moves
```
moveParticle :: (RealFrac a, Floating a) => Vec3 a -> Int -> StateIO a ()
moveParticle dx nPart = do
SimState config@(Configuration box particles) cll variables a -> Eval a ()
changeVolume dv = do
SimState config@(Configuration box particles) cll variables 2.0) new_cell_size || any (> 2) (zipWith ((abs .) . (-)) new_n (_nCells cll))
cll' = if recr
The simulation is pretty simple. We have a cubic box with particles (spheres) inside it. Then we choose to either move a particle or to change the volume of the box and check for any collisions between the particles. If there are any collisions we keep the old configuration.
To speed up collision detection I have also implemented a 'cell linked list' which is in principle a uniform grid with a minimum size of a cell equal to the particle diameter. In the C code I have implemented this using arrays while in the Haskell code I use a Vector of Lists.
You can find my full code here, but here are some of the most important snippets,
The main Simulation Loop, where type
Eval a = ReaderT (Env a) (StateT (SimState a) IO)
runSimulation :: Int -> Eval Double ()
runSimulation steps = do
Env npart ps do
forM_ [1..npart+1] (\_ -> do
SimState config _ variables >= \r -> return $ take 3 r
rn 0.3 && olddx 0.15 then 1.04 else 0.94)
liftIO $ print i
liftIO $ printf "dx = %.6f, dv = %.6f, nMov = %d, nVol = %d\n" newdx newdv (_nMov variables) (_nVol variables)
put $! state{_vars = variables{_dx = newdx, _dv = newdv, _nMov = 0, _nVol = 0}}
printConfig $ "Data/test" ++ show i ++ ".dat"
Here I'm effectively randomly choosing to either displace a particle or change the volume and every 100 steps I print useful information.
These two moves
moveParticle, changeVolume are listed below,```
moveParticle :: (RealFrac a, Floating a) => Vec3 a -> Int -> StateIO a ()
moveParticle dx nPart = do
SimState config@(Configuration box particles) cll variables a -> Eval a ()
changeVolume dv = do
SimState config@(Configuration box particles) cll variables 2.0) new_cell_size || any (> 2) (zipWith ((abs .) . (-)) new_n (_nCells cll))
cll' = if recr
Solution
Heap profiling will not tell you much in this case - memory leaking is definitely not your problem. Using cost-centre profiling would probably be better suited for identification of the hot-spots.
After looking at the profile a bit (using home-grown tools), the worst offenders seem to be:
-
Lots of calls to
To really get destructive updates, you would probably have to explicitly use a
-
Additionally, the
-
Finally, the default random number generator is slow - any time you use it you get a flurry of
Actually, I might have mis-interpreted the profile for point number two a bit - the following is actually a big no-no:
Haskell can do a lot of optimization, but only if it has enough information. As it stands,
Meaning that it traverse the list, checks that it contains exactly three elements, and then unpacks every
Concerning my original point:
Now this is arguably nice code - even though I personally don't like overly tricky
GHC will not unroll these, which means that you have constructed a multiple-level loop doing trivial list concatenations in a completely predictable manner.
If you for example wrote it as:
That would make
Some points on the structure:
-
You should try to isolate your simulation code as much as possible. Right now it is in "main.hs", which should normally just handle initialization and configuration. Splitting out the actual maths bits would be a pretty straightforward improvement.
-
On a similar note, you already have a monad, but aren't using it as an encapsulation tool - your core simulation code still uses
After looking at the profile a bit (using home-grown tools), the worst offenders seem to be:
-
Lots of calls to
stg_newArray, undoubtedly due to the CellList vector getting copied by modify. The documentation says that modify can update destructively, but I am pretty sure that this only refers to situations where the source Vector can be derived to be a temporary value available for fusion.To really get destructive updates, you would probably have to explicitly use a
MVector and the ST monad. As you already have a monad around, the amount of refactoring involved wouldn't be crushing, but you would obviously lose a bit of purity.-
Additionally, the
neighbours, getCell and cellNeighbours functions are getting hammered by your program. You are doing some pretty fancy list processing in there that GHC seems to not optimize very well - pretty much all these lists are actually generated and consumed. Helping GHC a bit by spelling out the loops might improve things.-
Finally, the default random number generator is slow - any time you use it you get a flurry of
Integer allocations behind the scenes. Try to find a better one on Hackage.Actually, I might have mis-interpreted the profile for point number two a bit - the following is actually a big no-no:
type CellIndex = [Int] -- The index of a cell [i, j, k]Haskell can do a lot of optimization, but only if it has enough information. As it stands,
getCell will get a prefix like:case y5 of wild18 {
[] -> lvl9;
: i1 ds8 -> case ds8 of wild19 {
[] -> lvl9;
: j1 ds9 -> case ds9 of wild20 {
[] -> lvl9;
: k1 ds10 -> case ds10 of wild21 {
[] -> case i1 of wild22 { GHC.Types.I# y6 ->
case j1 of wild23 { GHC.Types.I# y7 ->
case k1 of wild24 { GHC.Types.I# y8 -> ... } } };
: ipv1 ipv2 -> lvl9
} } } }Meaning that it traverse the list, checks that it contains exactly three elements, and then unpacks every
Int separately. By passing (Int, Int, Int) or your own Vec3 you would give Haskell enough information to derive that it can skip all the heap allocation.Concerning my original point:
cellNeighbours (CellList !nCells _) cell = do
i <- [-1..1]
j <- [-1..1]
k <- [-1..1]
let temp = zipWith3 (((+).).(+)) nCells cell [i, j, k]
return $! zipWith mod temp nCellsNow this is arguably nice code - even though I personally don't like overly tricky
(.) constructions or usage of the list monad in the first place (list comprehensions!). The performance problem here is that this becomes something like follows after desugaring:concatMap (\i -> concatMap (\j -> concatMap (\k -> [...]) [-1..1]) [-1..1]) [-1..1])GHC will not unroll these, which means that you have constructed a multiple-level loop doing trivial list concatenations in a completely predictable manner.
If you for example wrote it as:
cellNeighbours (CellList (d,e,f) _) (x,y,z) = map add offsets
where add (a,b,c) = (x+a+d, y+b+e, z+c+f)
offsets = [(a,b,c) | a <- [-1..1], b <- [-1..1], c <- [-1..1]]That would make
offsets a global constant that only gets evaluated once and would allow GHC to fuse the map with the concatMap in neighbours, eliminating the intermediate list completely.Some points on the structure:
-
You should try to isolate your simulation code as much as possible. Right now it is in "main.hs", which should normally just handle initialization and configuration. Splitting out the actual maths bits would be a pretty straightforward improvement.
-
On a similar note, you already have a monad, but aren't using it as an encapsulation tool - your core simulation code still uses
put directly. Try to find out what "verbs" you are really using, then define some simple wrappers for those (say increaseParticleMoves?). After you have eliminated all direct access, you can then export the monad and all actions to its own module, newtype-wrap it and stop exporting its definition. Having this sort of infrastructure would for example make the switch to ST much easier.Code Snippets
type CellIndex = [Int] -- The index of a cell [i, j, k]case y5 of wild18 {
[] -> lvl9;
: i1 ds8 -> case ds8 of wild19 {
[] -> lvl9;
: j1 ds9 -> case ds9 of wild20 {
[] -> lvl9;
: k1 ds10 -> case ds10 of wild21 {
[] -> case i1 of wild22 { GHC.Types.I# y6 ->
case j1 of wild23 { GHC.Types.I# y7 ->
case k1 of wild24 { GHC.Types.I# y8 -> ... } } };
: ipv1 ipv2 -> lvl9
} } } }cellNeighbours (CellList !nCells _) cell = do
i <- [-1..1]
j <- [-1..1]
k <- [-1..1]
let temp = zipWith3 (((+).).(+)) nCells cell [i, j, k]
return $! zipWith mod temp nCellsconcatMap (\i -> concatMap (\j -> concatMap (\k -> [...]) [-1..1]) [-1..1]) [-1..1])cellNeighbours (CellList (d,e,f) _) (x,y,z) = map add offsets
where add (a,b,c) = (x+a+d, y+b+e, z+c+f)
offsets = [(a,b,c) | a <- [-1..1], b <- [-1..1], c <- [-1..1]]Context
StackExchange Code Review Q#26081, answer score: 8
Revisions (0)
No revisions yet.