patternpythonMinor
Efficient element-wise function computation in Python
Viewed 0 times
computationefficientfunctionwiseelementpython
Problem
I have the following optimization problem:
Given two np.arrays
Here is an implementation using nested for-loops, which are acknowledged to be the slowest to solve these kind of problems.
Given two np.arrays
X,Y and a function K I would like to compute as fast as possible the matrix incidence gram_matrix where the (i,j)-th element is computed as K(X[i],Y[j]). Here is an implementation using nested for-loops, which are acknowledged to be the slowest to solve these kind of problems.
def proxy_kernel(X,Y,K):
gram_matrix = np.zeros((X.shape[0], Y.shape[0]))
for i, x in enumerate(X):
for j, y in enumerate(Y):
gram_matrix[i, j] = K(x, y)
return gram_matrixSolution
You can avoid the nested loops using
For example:
This is about twice as fast as
but it still has to evaluate the slow Python code in
then this is already suitable for applying to whole arrays:
and this is about five times as fast as using
numpy.meshgrid to build a table of entries in x and y, and numpy.vectorize to apply a function to all entries in the table:def tabulate(x, y, f):
"""Return a table of f(x, y)."""
return np.vectorize(f)(*np.meshgrid(x, y, sparse=True))For example:
>>> import operator
>>> tabulate(np.arange(1, 5), np.arange(1, 4), operator.mul)
array([[ 1, 2, 3, 4],
[ 2, 4, 6, 8],
[ 3, 6, 9, 12]])This is about twice as fast as
proxy_kernel:>>> from timeit import timeit
>>> X = Y = np.arange(2000)
>>> timeit(lambda:proxy_kernel(X, Y, operator.mul), number=1)
2.174600816098973
>>> timeit(lambda:tabulate(X, Y, operator.mul), number=1)
0.9889162541367114but it still has to evaluate the slow Python code in
f for each entry in the table. To really speed things up you need to use whole-array operations throughout. Depending on how you've written f, it may be possible to do this directly. For example, if you have something like:def f(x, y):
return np.exp(-0.5 * np.abs(x - y))then this is already suitable for applying to whole arrays:
>>> f(*np.meshgrid(np.arange(4), np.arange(4), sparse=True))
array([[ 1. , 0.60653066, 0.36787944, 0.22313016],
[ 0.60653066, 1. , 0.60653066, 0.36787944],
[ 0.36787944, 0.60653066, 1. , 0.60653066],
[ 0.22313016, 0.36787944, 0.60653066, 1. ]])and this is about five times as fast as using
numpy.vectorize:>>> timeit(lambda:f(*np.meshgrid(X, Y, sparse=True)), number=1)
0.1973482659086585Code Snippets
def tabulate(x, y, f):
"""Return a table of f(x, y)."""
return np.vectorize(f)(*np.meshgrid(x, y, sparse=True))>>> import operator
>>> tabulate(np.arange(1, 5), np.arange(1, 4), operator.mul)
array([[ 1, 2, 3, 4],
[ 2, 4, 6, 8],
[ 3, 6, 9, 12]])>>> from timeit import timeit
>>> X = Y = np.arange(2000)
>>> timeit(lambda:proxy_kernel(X, Y, operator.mul), number=1)
2.174600816098973
>>> timeit(lambda:tabulate(X, Y, operator.mul), number=1)
0.9889162541367114def f(x, y):
return np.exp(-0.5 * np.abs(x - y))>>> f(*np.meshgrid(np.arange(4), np.arange(4), sparse=True))
array([[ 1. , 0.60653066, 0.36787944, 0.22313016],
[ 0.60653066, 1. , 0.60653066, 0.36787944],
[ 0.36787944, 0.60653066, 1. , 0.60653066],
[ 0.22313016, 0.36787944, 0.60653066, 1. ]])Context
StackExchange Code Review Q#90005, answer score: 4
Revisions (0)
No revisions yet.