Faster solution for row-wise matrix subtraction
Problem
I have two matrices and need to calculate the Euclidean distance between every row of matrix A and every row of matrix B.
In the first solution I loop over the rows of the two matrices: for each row in matrix A, I take a row from B, square the element-wise difference, sum the result, and take the square root. I end up with a matrix of size (#rows in A) × (#rows in B):
import numpy as np

A = np.random.random((50, 3072))
B = np.random.random((500, 3072))

# First solution
d = np.zeros((50, 500))
for i in range(50):
    for j in range(500):
        d[i, j] = np.sqrt(np.sum(np.square(B[j] - A[i])))

In the second solution I did this by broadcasting, and it works. But when I increase the number of rows in A and B, it becomes very slow. Is there a faster way without looping?
Solution 2:
# Second solution
test = np.sqrt(np.sum(np.square(A[:, np.newaxis] - B), axis=2))

# print check
print(np.unique(test == d))
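An aside on why the broadcast version slows down (my addition, not part of the original question): `A[:, np.newaxis] - B` materializes an intermediate array of shape (len(A), len(B), 3072), whose memory cost grows with the product of the row counts. A minimal sketch that caps the intermediate by processing A in blocks; the function name `pairwise_dist_chunked` and the `chunk` parameter are hypothetical, chosen for illustration:

```python
import numpy as np

def pairwise_dist_chunked(A, B, chunk=64):
    """Euclidean distances between rows of A and rows of B,
    broadcasting only `chunk` rows of A at a time so the
    intermediate holds chunk * len(B) * A.shape[1] elements."""
    out = np.empty((A.shape[0], B.shape[0]))
    for start in range(0, A.shape[0], chunk):
        # (chunk, len(B), dim) intermediate, bounded by `chunk`
        block = A[start:start + chunk, np.newaxis] - B
        out[start:start + chunk] = np.sqrt((block ** 2).sum(axis=2))
    return out

A = np.random.random((50, 3072))
B = np.random.random((500, 3072))
d = pairwise_dist_chunked(A, B)
print(d.shape)  # (50, 500)
```

Tuning `chunk` trades memory for the small overhead of the outer loop.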
Solution

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5

This is based on the fact that
$$
(a - b)^2 = a^2 + b^2 - 2ab
$$
and so, ignoring the fudging with indices,
$$
\sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab
$$
The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)

and \$\sum ab = \vec A \cdot \vec B\$. This can be broadcast with a bit of fudging the axes:

np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
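One numerical caveat worth flagging as an aside (not part of the original answer): the expansion $a^2 + b^2 - 2ab$ can suffer floating-point cancellation, producing tiny negative sums that `np.sqrt` turns into NaN. A minimal sketch that clips before the square root and checks the result against the naive broadcast version:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((50, 3072))
B = rng.random((500, 3072))

# Naive reference: one distance per (row of A, row of B) pair
ref = np.sqrt(((A[:, np.newaxis] - B) ** 2).sum(axis=2))

# Expansion-based version, clipped to avoid sqrt of small negatives
d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * A @ B.T  # equivalent to the squeeze/dot trick above
d = np.sqrt(np.maximum(d, 0.0))

print(np.allclose(d, ref))  # True
```

For reference, SciPy ships this computation directly as `scipy.spatial.distance.cdist(A, B)`, which avoids both the intermediate array and the cancellation issue.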
StackExchange Code Review Q#77245, answer score: 7