HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Faster solution for row-wise matrix subtraction

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
wisesubtractionfasterforsolutionmatrixrow

Problem

I have 2 matrices. I have to calculate the euclidean distance between every row of matrix A and every row of matrix B.

In the first solution I loop over the rows of the two matrices and for each row in matrix A, I take a row from B, take the square of the element wise subtraction, then sum it and take the square root.

I end up with a matrix of the size #rows in A x #rows in B:

import numpy as np

A = np.random.random((50, 3072))
B = np.random.random((500, 3072))

# First solution
d = np.zeros((50, 500))
for i in range(50):
    for j in range (500):
        d[i,j] = np.sqrt(np.sum(np.square(B[j] - A[i])))


In the second solution I did this by broadcasting, and it works. But when I increase the amount of rows in A and B ... it becomes very very slow. Is there a faster way without looping?

Solution 2:

# Second solution
test = np.sqrt(np.sum(np.square(A[:,np.newaxis]-B),axis=2))

#print check
print np.unique(test==d)

Solution

It's not pretty, but this gives a factor-3 speed improvement:

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5


This is based off of the fact

$$
(a - b)^2 = a^2 + b^2 - 2ab
$$

and so, ignoring the fudging with indices,

$$
\sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab
$$

The squared terms are just

(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)


and \$\sum ab = \vec A \cdot \vec B\$. This can be broadcast with a bit of fudging the axes:

np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)

Code Snippets

d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5
(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)

Context

StackExchange Code Review Q#77245, answer score: 7

Revisions (0)

No revisions yet.