patternpythonMinor

Calculating Euclidean norm for each vector in a sparse matrix

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

matrixeacheuclideancalculatingfornormsparsevector

Problem

Below is a naive algorithm to find nearest neighbours for a point in some n-dimensional space.

import numpy as np
import scipy.sparse as ss

# brute force method to get k nearest neighbours to point p
def knn_naive(k, p, X):
    stacked_p = ss.vstack([p for n in range(X.get_shape()[0])])
    D = X - stacked_p
    D = np.sqrt(D.multiply(D).sum(1))
    result = sorted(D)[:k]
    return result

Types for arguments to knn_naive(...) are:

# type(k): int
# type(p): 
# type(X):

An example of dimensions would be:

X.shape: (100, 3004)
p.shape: (1, 3004)

Are there techniques to improve the above implementation in terms of computational time and/or memory?

Solution

I wasn't aware that scipy's sparse matrices did not support broadcasting. What a shame that you can't simply do D = X - p! Your construction of stacked_p can be sped up quite a bit, but you need to access the internals of the CSR matrix directly. If you aren't aware of the details, the wikipedia article has a good description of the format.

In any case, you can construct stacked_p as follows:

rows, cols = X.shape
stacked_p = ss.csr_matrix((np.tile(p.data, rows), np.tile(p.indices, rows),
                           np.arange(0, rows*p.nnz + 1, p.nnz)), shape=X.shape)

With some dummy data:

import numpy as np
import scipy.sparse as sps

p = sps.rand(1, 1000, density=0.01, format='csr')

In [15]: %timeit sps.vstack([p for n in xrange(1000)]).tocsr()
10 loops, best of 3: 173 ms per loop

In [16]: %timeit sps.csr_matrix((np.tile(p.data, 1000), np.tile(p.indices, 1000),
                                 np.arange(0, 1000*p.nnz + 1, p.nnz)), (1000, 1000))
10000 loops, best of 3: 94.1 µs per loop

That is close to 2000x times faster...

Code Snippets

rows, cols = X.shape
stacked_p = ss.csr_matrix((np.tile(p.data, rows), np.tile(p.indices, rows),
                           np.arange(0, rows*p.nnz + 1, p.nnz)), shape=X.shape)

import numpy as np
import scipy.sparse as sps

p = sps.rand(1, 1000, density=0.01, format='csr')

In [15]: %timeit sps.vstack([p for n in xrange(1000)]).tocsr()
10 loops, best of 3: 173 ms per loop

In [16]: %timeit sps.csr_matrix((np.tile(p.data, 1000), np.tile(p.indices, 1000),
                                 np.arange(0, 1000*p.nnz + 1, p.nnz)), (1000, 1000))
10000 loops, best of 3: 94.1 µs per loop

Context

StackExchange Code Review Q#47027, answer score: 3

Revisions (0)

No revisions yet.