Optimize Scipy Sparse Matrix Factorization code for SGD
Problem
I'm working on implementing the stochastic gradient descent algorithm for recommender systems (see Funk) using sparse matrices with Scipy.
This is what a first basic implementation looks like:
N = self.model.shape[0]  # number of users
M = self.model.shape[1]  # number of items
self.p = np.random.rand(N, K)
self.q = np.random.rand(M, K)
rows, cols = self.model.nonzero()
for step in range(steps):
    for u, i in zip(rows, cols):
        e = self.model - np.dot(self.p, self.q.T)  # calculate error for gradient
        p_temp = learning_rate * (e[u, i] * self.q[i, :] - regularization * self.p[u, :])
        self.q[i, :] += learning_rate * (e[u, i] * self.p[u, :] - regularization * self.q[i, :])
        self.p[u, :] += p_temp
I have two questions, and I am hoping some seasoned Python coders / recommender-systems experts here could provide me with some insight:
1) How often should the error e be updated? It is not clear from Funk's and Koren's papers whether it should be once per rating or once per factor. That is, should I take it out of the for loop, leave it where it is, or move it even deeper (inside the iteration over the K factors)?
2) Unfortunately, my code is pretty slow: about 20 minutes for ONE iteration over a 100k-ratings dataset, which means computing the factorization over the whole dataset would take about 10 hours. I suspect this is due to the for loop over the sparse matrix. I've tried expressing the q and p updates using fancy indexing, but since I'm still pretty new to scipy and numpy, I couldn't figure out a better way to do it. Do you have any pointers on how I could avoid iterating over the rows and columns of the sparse matrix explicitly?
Solution
There may be other optimizations available, but this one alone should go a long way...
e = self.model - np.dot(self.p,self.q.T), the resulting e is a dense matrix the same size as your model. You later only use e[u, i] in your loop, so you are throwing away what probably are millions of computed values. If you simply replace that line with:e = self.model[u, i] - np.dot(self.p[u, :], self.q[i, :])and then replace
e[u, i] with e, you will spare yourself a huge amount both of computations and memory accesses, which should boost performance by a lot.Code Snippets
e = self.model[u, i] - np.dot(self.p[u, :], self.q[i, :])Context
StackExchange Code Review Q#35727, answer score: 5
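For completeness, here is a minimal, self-contained sketch of the full training loop with the answer's fix applied: a scalar error per rating instead of a dense error matrix. The function name sgd_factorize, the toy ratings matrix, and the hyperparameter values are illustrative assumptions, not part of the original post.

```python
import numpy as np
from scipy.sparse import csr_matrix

def sgd_factorize(model, K=2, steps=500, learning_rate=0.02, regularization=0.02):
    """SGD matrix factorization; `model` is a scipy sparse ratings matrix."""
    N, M = model.shape
    rng = np.random.default_rng(0)
    p = rng.random((N, K))  # user factors, one row per user
    q = rng.random((M, K))  # item factors, one row per item
    rows, cols = model.nonzero()
    for step in range(steps):
        for u, i in zip(rows, cols):
            # The fix from the answer: a scalar error for this single rating,
            # instead of recomputing a dense N x M error matrix every update.
            e = model[u, i] - np.dot(p[u, :], q[i, :])
            p_temp = learning_rate * (e * q[i, :] - regularization * p[u, :])
            q[i, :] += learning_rate * (e * p[u, :] - regularization * q[i, :])
            p[u, :] += p_temp
    return p, q

# Toy example: 3 users x 3 items, where zero means "unrated".
ratings = csr_matrix(np.array([[5.0, 3.0, 0.0],
                               [4.0, 0.0, 1.0],
                               [0.0, 1.0, 5.0]]))
p, q = sgd_factorize(ratings)
```

After training, np.dot(p, q.T) approximates the observed ratings. Note that single-element indexing of a CSR matrix inside the loop has its own overhead, so a further cheap win would be to iterate over the precomputed nonzero values directly (e.g. via the matrix's COO representation) rather than indexing the sparse matrix on every update.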