patternpythonMinor
Monte Carlo simulation of amoeba population
Viewed 0 times
amoebamontesimulationpopulationcarlo
Problem
I wrote a simple Python simulation to answer the "Amoeba" population question posed here:
A population of amoebas starts with 1. After 1 period that amoeba can divide into 1, 2, 3, or 0 (it can die) with equal probability. What is the probability that the entire population dies out eventually?
The script seems to work fine, but then I started playing with it to investigate curious effects.
If the number of trial generations is too large (
I would appreciate feedback on efficiency options, and also on general code improvements. I know the runs are independent and could be run in parallel. But that is still sort of brute force. I am curious more about making a single thread approach faster.
```
from __future__ import division
import random
import math
import time
def run_trial(max_split, num_generations):
population = 1
for generation in xrange(num_generations):
for amoeba in xrange(population):
amoeba_split = random.randint(0, max_split)
population -= 1 # remove current amoeba (she will split or die)
population += amoeba_split
if population == 0:
break
return population
def main():
extinct_counter = 0
trials = 10000
max_split_per_amoeba = 3
num_generations_per_trial = 20 # populations can get massive as generations increase (memory / overflow errors at 100)
print 'starting simulation'
print 'num trials: %i' % (trials)
print 'max_split_per_amoeba: %i' % (max_split_per_amoeba)
print 'num_generations_per_trial: %i' % (num_generations_per_trial)
for trial in xrange(trials):
outcome_population = run_trial(max_split_per_amoeba, num_generations_per_trial)
if outcome_population == 0:
extinct_counter += 1
if divmod(trial
A population of amoebas starts with 1. After 1 period that amoeba can divide into 1, 2, 3, or 0 (it can die) with equal probability. What is the probability that the entire population dies out eventually?
The script seems to work fine, but then I started playing with it to investigate curious effects.
If the number of trial generations is too large (
num_generations_per_trial), I have problems with performance - the population size gets huge, and the simulation either runs slow or I encounter OverflowError on my brute force FOR loops.I would appreciate feedback on efficiency options, and also on general code improvements. I know the runs are independent and could be run in parallel. But that is still sort of brute force. I am curious more about making a single thread approach faster.
```
from __future__ import division
import random
import math
import time
def run_trial(max_split, num_generations):
population = 1
for generation in xrange(num_generations):
for amoeba in xrange(population):
amoeba_split = random.randint(0, max_split)
population -= 1 # remove current amoeba (she will split or die)
population += amoeba_split
if population == 0:
break
return population
def main():
extinct_counter = 0
trials = 10000
max_split_per_amoeba = 3
num_generations_per_trial = 20 # populations can get massive as generations increase (memory / overflow errors at 100)
print 'starting simulation'
print 'num trials: %i' % (trials)
print 'max_split_per_amoeba: %i' % (max_split_per_amoeba)
print 'num_generations_per_trial: %i' % (num_generations_per_trial)
for trial in xrange(trials):
outcome_population = run_trial(max_split_per_amoeba, num_generations_per_trial)
if outcome_population == 0:
extinct_counter += 1
if divmod(trial
Solution
Performance
Limiting each trial to 20 generations is not really helpful. Let's put it this way: the population is expected to increase exponentially with each generation. Once the population reaches, say, 10 amoebas, it's pretty safe to declare that infection has taken hold and the population will never dwindle to 0.
If you run a trial for 20 generations, you can easily get populations in the thousands. The problem with letting populations grow so large is that you have to decide the fate of each individual amoeba at each time step, and that ends up being a lot of work.
If you change
Expressiveness
I suggest writing a generator for the population count at each time step. You can easily count elements using the
Limiting each trial to 20 generations is not really helpful. Let's put it this way: the population is expected to increase exponentially with each generation. Once the population reaches, say, 10 amoebas, it's pretty safe to declare that infection has taken hold and the population will never dwindle to 0.
If you run a trial for 20 generations, you can easily get populations in the thousands. The problem with letting populations grow so large is that you have to decide the fate of each individual amoeba at each time step, and that ends up being a lot of work.
If you change
num_generations_per_trial to 10, you'll get identical results, and finish much more quickly. But how do you decide what an appropriately low num_generations_per_trial number is? It would be much more intuitive to impose a cap on the population size than on the number of generations.Expressiveness
I suggest writing a generator for the population count at each time step. You can easily count elements using the
sum() function.from __future__ import division
from random import randint
def amoeba_generations(max_split=3, population=1):
yield population
while population > 0:
population = sum(randint(0, max_split) for _ in range(population))
yield population
def trial(generator, success_pred, failure_pred):
for g in generator:
if success_pred(g):
return True
if failure_pred(g):
return False
return False
def main():
num_trials = 10000
max_split_per_amoeba = 3
num_dead_populations = sum(
trial(
amoeba_generations(max_split_per_amoeba),
lambda n: n == 0,
lambda n: n >= 10 # Assume a population of >= 10 will never die
)
for _ in range(num_trials)
)
print("""***starting simulation***
num trials: {num_trials}
max_split_per_amoeba: {max_split_per_amoeba}
extinct outcomes: {num_dead_populations}
extinction probability: {extinction_probability:.4f}
expected probability: {expected_probability:.4f}
delta from answer: {delta:.4f}""".format(
extinction_probability=num_dead_populations/num_trials,
expected_probability=2**.5 - 1,
delta=(num_dead_populations/num_trials) - (2**.5 - 1),
**locals()
))
if __name__ == '__main__':
main()Code Snippets
from __future__ import division
from random import randint
def amoeba_generations(max_split=3, population=1):
yield population
while population > 0:
population = sum(randint(0, max_split) for _ in range(population))
yield population
def trial(generator, success_pred, failure_pred):
for g in generator:
if success_pred(g):
return True
if failure_pred(g):
return False
return False
def main():
num_trials = 10000
max_split_per_amoeba = 3
num_dead_populations = sum(
trial(
amoeba_generations(max_split_per_amoeba),
lambda n: n == 0,
lambda n: n >= 10 # Assume a population of >= 10 will never die
)
for _ in range(num_trials)
)
print("""***starting simulation***
num trials: {num_trials}
max_split_per_amoeba: {max_split_per_amoeba}
extinct outcomes: {num_dead_populations}
extinction probability: {extinction_probability:.4f}
expected probability: {expected_probability:.4f}
delta from answer: {delta:.4f}""".format(
extinction_probability=num_dead_populations/num_trials,
expected_probability=2**.5 - 1,
delta=(num_dead_populations/num_trials) - (2**.5 - 1),
**locals()
))
if __name__ == '__main__':
main()Context
StackExchange Code Review Q#127592, answer score: 4
Revisions (0)
No revisions yet.