patternpythonMinor
Lattice lotka-volterra
Viewed 0 times
latticelotkavolterra
Problem
Can anyone comment on the efficiency of my code? I am new with Python using NumPy/SciPy and I am trying to program a lattice biological model where a certain (one and only one) species (represented by a nonzero element) occupies a lattice square (represented in the code by a 2D array). This species can reproduce to an adjacent and empty lattice square (represented by a zero element). (Quite similar to a lattice gas model.)
The part of the code that takes the longest time is the reproduction part (
I am thinking of creating a collection of indices (2D index) where I can choose a random element and delete it at the same time so that the
```
import numpy as np
import matplotlib.pyplot as plt
import random
import time
import csv
open('p14.4.csv','w').close()
t0 = time.time()
def dispose(matrix,dens,s,sp):
limit = int(nndens)
for num in range(limit):
i = np.random.randint(n)
j = np.random.randint(n)
while matrix[i,j] != 0:
i = np.random.randint(n)
j = np.random.randint(n)
matrix[i,j] = sp
return matrix
def mort(matrix,d):
RN = np.random.random((n,n))
casualty = (RN n-1) or (j+qn-1)):
p = np.random.randint(-1,2)
q = np.random.randint(-1,2)
if matrix[i+p,j+q] == 0 and b[spp-1] > np.random.random():
matrix[i+p,j+q] = matrix[i,j]
return matrix
def csv_writer(data,path,line):
with open(path, "a") as csv_file:
writer = csv.writer(csv_file, delimiter=',')
writer.writerow(data)
csv_file.close()
return path
#birth rate
spp = 10
B = [0.57710226,0.57722738,0.577233935,0.577120507,0.576885731,0.576528293,
0.576046935,0.575440457,0.574
The part of the code that takes the longest time is the reproduction part (
def repro) because it has to check whether a site is already chosen every time a random site is generated to avoid repetition (the first while loop in def repro). Can anyone suggest a faster way to do this?I am thinking of creating a collection of indices (2D index) where I can choose a random element and delete it at the same time so that the
while loop won't be necessary. How can I do this?```
import numpy as np
import matplotlib.pyplot as plt
import random
import time
import csv
open('p14.4.csv','w').close()
t0 = time.time()
def dispose(matrix,dens,s,sp):
limit = int(nndens)
for num in range(limit):
i = np.random.randint(n)
j = np.random.randint(n)
while matrix[i,j] != 0:
i = np.random.randint(n)
j = np.random.randint(n)
matrix[i,j] = sp
return matrix
def mort(matrix,d):
RN = np.random.random((n,n))
casualty = (RN n-1) or (j+qn-1)):
p = np.random.randint(-1,2)
q = np.random.randint(-1,2)
if matrix[i+p,j+q] == 0 and b[spp-1] > np.random.random():
matrix[i+p,j+q] = matrix[i,j]
return matrix
def csv_writer(data,path,line):
with open(path, "a") as csv_file:
writer = csv.writer(csv_file, delimiter=',')
writer.writerow(data)
csv_file.close()
return path
#birth rate
spp = 10
B = [0.57710226,0.57722738,0.577233935,0.577120507,0.576885731,0.576528293,
0.576046935,0.575440457,0.574
Solution
There's quite a lot of code here, so for now I'm just going to look at the
-
There's no docstring. What does this function do? How do I call it? What values do I pass for parameters? What does it return? It's very hard to maintain code when you don't know what it is supposed to do. This doesn't have to be long-winded or laborious: a quick explanation like the following is perfectly adequate.
-
There's no good reason to abbreviate the parameter to
-
The function takes a parameter
-
The function uses the global variable
-
The name
-
The function always returns
-
The name
-
The name
-
The value 0 is arbitrary and so for extra generality it should be a function parameter (taking the default value 0). Imagine having a "speciation" operation in which you replace some fraction of a species with a descendant. It would be nice to be able to reuse this code.
-
If
Even if
Perhaps you plan not to fill very much of the array, but even so it is a good idea to write robust code in case you change your plans or want to reuse this code in other programs.
-
The way that the positions are found is to repeatedly generate random coordinates and check to see if the position is 0. This is inefficient. Imagine that
Perhaps you plan not to fill very much of the array, but again, it is a good idea to write code with the right asymptotic complexity, in case you change your plans or want to reuse this code in other programs.
So it would be better to find the positions containing 0 (using
Putting all this together, I'd write the function like this:
Note the use
However, this has unacceptably poor performance compared to
The reason for this poor performance is that
So what about sampling with replacement instead (which is fast), eliminating duplicates using
```
def populate(array, density, value, original=0):
"""Choose array.size * density positions in array that are equal to
original, uniformly at random, and
dispose function. You'll see that there's more than enough to say about this function for one answer.- Review
-
There's no docstring. What does this function do? How do I call it? What values do I pass for parameters? What does it return? It's very hard to maintain code when you don't know what it is supposed to do. This doesn't have to be long-winded or laborious: a quick explanation like the following is perfectly adequate.
"""Choose n*n*dens positions in matrix with the value 0, uniformly at
random, and set them to sp. Return matrix."""-
There's no good reason to abbreviate the parameter to
dens. It only appears once in the function body, and we're not running out of letters, so call it density.-
The function takes a parameter
s but this is not used. Better to omit it.-
The function uses the global variable
n. It would be better to pass this in as a parameter, or even better, deduce it from the size of matrix. Presumably the intention is that matrix is always an n × n array, in which case you should write matrix.size instead of n * n.-
The name
dispose is vague and does not convey much to me. I think the name populate would be better.-
The function always returns
matrix, which is useless, as the caller already knows about matrix.-
The name
matrix is slightly misleading, as Numpy has a matrix class but this parameter does not belong to it! It's actually an ndarray. I would call the parameter array or maybe just a.-
The name
sp is rather too specific. Presumably it is short for "species", but really nothing about this function depends on this value representing a species. You can imagine other types of lattice simulation where the array entries represent molecules, grains of sand, people, etc. So for generality I'd name this parameter value.-
The value 0 is arbitrary and so for extra generality it should be a function parameter (taking the default value 0). Imagine having a "speciation" operation in which you replace some fraction of a species with a descendant. It would be nice to be able to reuse this code.
-
If
dens is greater than 1, then limit will be greater than the number of elements in the array, and so the function will run forever.Even if
dens is less than 1, there might still not be enough positions in the array with value 0, and so the function will run forever.Perhaps you plan not to fill very much of the array, but even so it is a good idea to write robust code in case you change your plans or want to reuse this code in other programs.
-
The way that the positions are found is to repeatedly generate random coordinates and check to see if the position is 0. This is inefficient. Imagine that
dens is 1, so that all positions must be set to sp. How long will this take, on average? This is the well known "coupon collector's problem" so the expected number of coordinates to be randomly generated is \$ n^2 \log n^2 \$.Perhaps you plan not to fill very much of the array, but again, it is a good idea to write code with the right asymptotic complexity, in case you change your plans or want to reuse this code in other programs.
So it would be better to find the positions containing 0 (using
numpy.flatnonzero), then choose a random sample of those positions without replacement (using numpy.random.choice), and then set all the positions.- Performance
Putting all this together, I'd write the function like this:
def populate(array, density, value, original=0):
"""Choose array.size * density positions in array that are equal to
original, uniformly at random, and set them to value.
"""
a = array.reshape(-1)
count = int(a.size * density)
a[np.random.choice(np.flatnonzero(a == original), count, False)] = valueNote the use
numpy.reshape to create a flattened view of array, so that we can use numpy.random.choice, which requires a 1-D array. We then use this view to update the original array.However, this has unacceptably poor performance compared to
dispose when density is small:>>> a = np.zeros((1000, 1000))
>>> timeit(lambda:dispose(a, 0.01, None, 1), number=1)
0.0184222329990007
>>> timeit(lambda:populate(a, 0.01, 1), number=1)
0.2894051409966778The reason for this poor performance is that
numpy.random.choice computes a sample without replacement by permuting the whole array and using an initial prefix of the permutation. This is expensively wasteful when only a small part of the permutation is needed.So what about sampling with replacement instead (which is fast), eliminating duplicates using
numpy.unique and looping until we have enough choices (as in the original post)? Here's the implementation:```
def populate(array, density, value, original=0):
"""Choose array.size * density positions in array that are equal to
original, uniformly at random, and
Code Snippets
"""Choose n*n*dens positions in matrix with the value 0, uniformly at
random, and set them to sp. Return matrix."""def populate(array, density, value, original=0):
"""Choose array.size * density positions in array that are equal to
original, uniformly at random, and set them to value.
"""
a = array.reshape(-1)
count = int(a.size * density)
a[np.random.choice(np.flatnonzero(a == original), count, False)] = value>>> a = np.zeros((1000, 1000))
>>> timeit(lambda:dispose(a, 0.01, None, 1), number=1)
0.0184222329990007
>>> timeit(lambda:populate(a, 0.01, 1), number=1)
0.2894051409966778def populate(array, density, value, original=0):
"""Choose array.size * density positions in array that are equal to
original, uniformly at random, and set them to value.
"""
a = array.reshape(-1)
count = int(a.size * density)
while count:
p = np.random.choice(a.size, count)
p = np.unique(p[a[p] == original])
a[p] = value
count -= p.size>>> a = np.zeros((1000,1000))
>>> timeit(lambda:populate(a, 0.01, 1), number=1)
0.0019981740042567253Context
StackExchange Code Review Q#84742, answer score: 2
Revisions (0)
No revisions yet.