HiveBrain v1.2.0

Optimize Scipy Sparse Matrix Factorization code for SGD

Submitted by: @import:stackexchange-codereview

Problem

I'm working on implementing the stochastic gradient descent algorithm for recommender systems (see Funk) using sparse matrices with Scipy.

This is what a first, basic implementation looks like:

N = self.model.shape[0]  # number of users
M = self.model.shape[1]  # number of items
self.p = np.random.rand(N, K)
self.q = np.random.rand(M, K)
rows, cols = self.model.nonzero()
for step in xrange(steps):
    for u, i in zip(rows, cols):
        e = self.model - np.dot(self.p, self.q.T)  # calculate error for gradient
        p_temp = learning_rate * (e[u, i] * self.q[i, :] - regularization * self.p[u, :])
        self.q[i, :] += learning_rate * (e[u, i] * self.p[u, :] - regularization * self.q[i, :])
        self.p[u, :] += p_temp


I have two questions, and I'm hoping some seasoned Python coders / recommender-systems experts here can provide me with some insight:

1) How often should the error e be recomputed? It is not clear from Funk's and Koren's papers whether it should be once per rating or once per factor. In other words, should I take it out of the inner for loop, leave it where it is, or push it even deeper (into the loop over the K factors)?

2) Unfortunately, my code is pretty slow, taking about 20 minutes for ONE iteration over a 100k-ratings dataset. This means that computing the factorization over the whole dataset takes about 10 hours. I suspect this is due to the for loop over the sparse matrix. I've tried expressing the q and p updates using fancy indexing, but since I'm still pretty new to scipy and numpy, I couldn't figure out a better way to do it. Do you have any pointers on how I could avoid iterating over the rows and columns of the sparse matrix explicitly?

Solution

There may be other optimizations available, but this one alone should go a long way...

When you do e = self.model - np.dot(self.p, self.q.T), the resulting e is a dense matrix the same size as your model. You only ever use e[u, i] in your loop, so you are throwing away what are probably millions of computed values. If you simply replace that line with:

e = self.model[u, i] - np.dot(self.p[u, :], self.q[i, :])


and then replace e[u, i] with e, you will spare yourself a huge amount of both computation and memory access, which should boost performance considerably.
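To make this concrete, here is a minimal, self-contained sketch of the whole training loop with the per-rating scalar error in place. The names (ratings, p, q) and the tiny demo matrix are made up for illustration and are not the author's class; zeros stand in for missing entries, and the hyperparameters are arbitrary:

```python
import numpy as np

# Tiny demo ratings matrix; zeros mark missing entries (hypothetical data).
ratings = np.array([[5.0, 3.0, 0.0],
                    [4.0, 0.0, 1.0],
                    [0.0, 2.0, 4.0]])
K = 2                    # number of latent factors
learning_rate = 0.02
regularization = 0.02
steps = 500

rng = np.random.default_rng(0)
N, M = ratings.shape
p = rng.random((N, K))   # user factors
q = rng.random((M, K))   # item factors
rows, cols = ratings.nonzero()

for step in range(steps):
    for u, i in zip(rows, cols):
        # Scalar error for this one rating -- no dense N x M error matrix.
        e = ratings[u, i] - np.dot(p[u, :], q[i, :])
        p_temp = learning_rate * (e * q[i, :] - regularization * p[u, :])
        q[i, :] += learning_rate * (e * p[u, :] - regularization * q[i, :])
        p[u, :] += p_temp
```

Each update now costs O(K) instead of O(N·M), which is where the speedup comes from.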

Code Snippets

e = self.model[u, i] - np.dot(self.p[u, :], self.q[i, :])
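As a sanity check, a small standalone sketch (with made-up arrays standing in for self.model, self.p, and self.q) confirms that the scalar expression picks out exactly the [u, i] entry of the full dense error matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
model = rng.random((4, 3))  # stand-in for self.model (dense here for simplicity)
p = rng.random((4, 2))      # user factors
q = rng.random((3, 2))      # item factors

u, i = 2, 1
e_full = model - np.dot(p, q.T)                     # full 4 x 3 error matrix (wasteful)
e_scalar = model[u, i] - np.dot(p[u, :], q[i, :])   # single entry, O(K) work

# Both expressions yield the same value for the (u, i) rating.
print(np.isclose(e_full[u, i], e_scalar))
```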

Context

StackExchange Code Review Q#35727, answer score: 5
