Applying a formula to 2D numpy arrays row-wise
Problem
The user has two 2D input arrays A and B, and a given matrix S. He wants to apply a complicated formula to these arrays row-wise to get C. Something like:
$$C_i = f(S, A_i, B_i)$$
where f is some complicated function, implemented by the user. That is, the user wants to supply his complicated formula in terms of the row vectors, and whatever additional data is necessary for that formula. The implementation of the formula must be a function.
For the sake of this example only, the complicated formula will be the dot product, and the "additional data" for the formula will be the identity matrix. The real application is a lot more complicated.
My question is: How can I express the line
C = np.fromiter(map(partial(users_formula, S), A, B), dtype=np.float64)
in a cleaner way in NumPy? Speed or memory consumption is not a major concern, but code readability is. I suspect that there is a better way to do it in NumPy.
```
from __future__ import print_function
from functools import partial
import numpy as np


def main():
    # Some dummy data just for testing purposes
    A = np.array([[-0.486978,  0.810468,  0.325568],
                  [-0.640856,  0.640856,  0.422618],
                  [-0.698328,  0.628777,  0.34202 ],
                  [-0.607665,  0.651641,  0.45399 ]])
    B = np.array([[ 0.075083,  0.41022 , -0.908891],
                  [-0.025583,  0.532392, -0.846111],
                  [ 0.014998,  0.490579, -0.871268],
                  [-0.231477,  0.401497, -0.886125]])
    S = np.identity(3)
    #---------------------------------------------------------------
    # The problematic line is below. What is the proper way to
    # express this in Numpy?
    C = np.fromiter(map(partial(users_formula, S), A, B), dtype=np.float64)
    assert np.allclose(C, 0.0, atol=1.0e-6), C
    print('Done!')


def users_formula(S, a, b):
    # a == A_i, b == B_i
    # In the real application, the user gives his complicated
    # formula here. The matrix
    # (The comment above is cut off in the source. The return below is an
    # assumed reconstruction of the dot-product example described earlier.)
    return np.dot(a, np.dot(S, b))


if __name__ == '__main__':
    main()
```
Solution
Here are some alternatives using your arrays:
Yours, for reference:
```
In [19]: np.fromiter(map(partial(users_formula, S), A, B), dtype=np.float64)
Out[19]:
array([ 5.88698000e-07, -1.11998000e-07, 1.87179000e-07,
        4.89032000e-07])
```
List comprehension with row indexing:
```
In [20]: np.array([users_formula(S,A[i],B[i]) for i in range(A.shape[0])])
Out[20]:
array([ 5.88698000e-07, -1.11998000e-07, 1.87179000e-07,
        4.89032000e-07])
```
Make the row indexing a bit more explicit (longer, but clearer):
```
In [21]: np.array([users_formula(S,A[i,:],B[i,:]) for i in range(A.shape[0])])
Out[21]:
array([ 5.88698000e-07, -1.11998000e-07, 1.87179000e-07,
        4.89032000e-07])
```
Replace indexing with good old Python zip (this is my personal favorite for readability):
```
In [22]: np.array([users_formula(S,a,b) for a,b in zip(A,B)])
Out[22]:
array([ 5.88698000e-07, -1.11998000e-07, 1.87179000e-07,
        4.89032000e-07])
```
Even though I'm generally familiar with apply_along_axis, I'm having problems with its application. Even if I get it right, that's not a good sign. It may be the most compact, but it clearly won't be the clearest.
```
In [23]: np.apply_along_axis(partial(users_formula,S),0,A,B)
------------------ ...
ValueError: matrices are not aligned
```
There are some other functions to explore, such as np.vectorize and np.frompyfunc. But they'll have the same problem: I'd have to study the docs and experiment to get a working example.
The problem with apply_along_axis is that it's designed to iterate over one array, not several.
```
apply_along_axis(func1d,axis,arr,*args)
apply_along_axis(...,0, A, B)
```
This would iterate over the 1-D slices of A (with axis 0 those are its columns; axis 1 would give its rows), but use the whole B on every call. S could be passed as *args. But to use both A and B, I'd have to concatenate them into one array, and then change your function to handle 'rows' from that. MESSY.
Internally, apply_along_axis is just a generalization of:
```
outarray = np.empty(A.shape[0], A.dtype)
for i in range(A.shape[0]):
    outarray[i] = users_formula(S,A[i,:],B[i,:])
```
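For what it's worth, here is a minimal sketch of the np.vectorize route mentioned above. It assumes NumPy 1.12 or later (for the `signature` argument) and a stand-in `users_formula` that computes the dot product from the question. `row_formula` and `apply_rowwise` are hypothetical names introduced only for this illustration. Treat it as one possible spelling, not necessarily a clearer one than the zip version.

```python
import numpy as np


def users_formula(S, a, b):
    # Stand-in for the user's formula (assumed: the dot product a . S . b).
    return np.dot(a, np.dot(S, b))


A = np.array([[-0.486978, 0.810468, 0.325568],
              [-0.640856, 0.640856, 0.422618],
              [-0.698328, 0.628777, 0.34202],
              [-0.607665, 0.651641, 0.45399]])
B = np.array([[0.075083, 0.41022, -0.908891],
              [-0.025583, 0.532392, -0.846111],
              [0.014998, 0.490579, -0.871268],
              [-0.231477, 0.401497, -0.886125]])
S = np.identity(3)


def row_formula(a, b):
    # Hypothetical helper: binds S so that only the two row vectors vary.
    return users_formula(S, a, b)


# '(n),(n)->()' declares two 1-D inputs of equal length and a scalar output.
# np.vectorize then loops over the remaining leading axis, i.e. the rows.
apply_rowwise = np.vectorize(row_formula, signature='(n),(n)->()')
C = apply_rowwise(A, B)
print(C)
```

Note that np.vectorize is still a Python-level loop under the hood, so the gain, if any, is in how the call reads rather than in speed.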
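To make the "concatenate them into one array" idea concrete, the following sketch joins A and B side by side and lets apply_along_axis walk the rows of the joined array with axis=1. The wrapper `formula_on_joined_row` is a hypothetical name, and `users_formula` is again assumed to be the dot product from the question. It works, but it also shows why this route is harder to read.

```python
import numpy as np


def users_formula(S, a, b):
    # Stand-in for the user's formula (assumed: the dot product a . S . b).
    return np.dot(a, np.dot(S, b))


A = np.array([[-0.486978, 0.810468, 0.325568],
              [-0.640856, 0.640856, 0.422618],
              [-0.698328, 0.628777, 0.34202],
              [-0.607665, 0.651641, 0.45399]])
B = np.array([[0.075083, 0.41022, -0.908891],
              [-0.025583, 0.532392, -0.846111],
              [0.014998, 0.490579, -0.871268],
              [-0.231477, 0.401497, -0.886125]])
S = np.identity(3)


def formula_on_joined_row(row, S):
    # Hypothetical wrapper: split one joined row back into A_i and B_i.
    a, b = np.split(row, 2)
    return users_formula(S, a, b)


# Join the arrays side by side: each row of AB is A_i followed by B_i.
AB = np.concatenate([A, B], axis=1)

# axis=1 hands each row of AB to the wrapper; S is forwarded via *args.
C = np.apply_along_axis(formula_on_joined_row, 1, AB, S)
print(C)
```

The extra wrapper and the split step are bookkeeping that the zip-based list comprehension avoids entirely.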
Context
StackExchange Code Review Q#109479, answer score: 4