patternpythonMinor
sparse vector dot product
Viewed 0 times
productsparsevectordot
Problem
I'm trying to implement a sparse vector (most elements are zero) dot product calculation. My major idea is to represent each sparse vector as a list (which holds only non-zero dimensions), and each element in the list is a 2-dimensional tuple -- where first dimension is index of vector, and 2nd dimension is its related value.
Any comments about my code improvement, bugs, etc. are appreciated. I also welcome any good advice for in general how to best implement dot product for sparse vector using Python.
Any comments about my code improvement, bugs, etc. are appreciated. I also welcome any good advice for in general how to best implement dot product for sparse vector using Python.
def dot_product(v1, v2):
i1 = 0
i2 = 0
result = 0.0
while i1 v2[i2][0]:
i2 += 1
else:
i1 += 1
return result
if __name__ == "__main__":
v1 = [(0, 2), (1, 4), (5, 6)] # (2,4,0,0,0,6)
v2 = [(1,3),(2,4),(5,7)] #(0,3,4,0,0,7)
print dot_product(v1, v2)Solution
Valid. But your code is messy. You are taking one step away from the original traditional problem, and that actually becomes a problem form your code.
Your actually better off with the one dimensional arrays then the two dimensional tuples. Tuples may be the wrong data type to use for your problem to resolve itself nicely.
Styling notes:
This:
suggest to me that v1 and v2 are vectors, not a list of vectors, but by changing to:
the obscurity magically disappears.
This method of commenting your code is called annotations and is commonly used in python, I like them a lot.
The
could be more readable if:
An analogue solution which will scale better is
and the function sparse_vector_dot could be written as:
It will scale better since we are never checking what we don't have to check, the only thing we have to check is if there is a corresponding index in the other list (now dict).
Your actually better off with the one dimensional arrays then the two dimensional tuples. Tuples may be the wrong data type to use for your problem to resolve itself nicely.
Styling notes:
This:
def dot_product(v1, v2):
...suggest to me that v1 and v2 are vectors, not a list of vectors, but by changing to:
def dot_product1(v1: [()], v2: [()]):
...the obscurity magically disappears.
This method of commenting your code is called annotations and is commonly used in python, I like them a lot.
The
...
return result
if __name__ == "__main__":
...could be more readable if:
...
return result
if __name__ == "__main__":
...An analogue solution which will scale better is
def get_pointers(v: tuple):
return {key: value for key, value in enumerate(v) if value}
def sparse_vector_dot(a: dict, b: dict):
result = 0
for key, value in a.items():
if key in b:
result += b[key] * value
return result
def main():
a = (2, 4, 0, 0, 0, 6)
b = (0, 3, 4, 0, 0, 7)
a, b = get_pointers(a), get_pointers(b)
print(sparse_vector_dot(a, b))and the function sparse_vector_dot could be written as:
def sparse_vector_dot_oneliner(a: dict, b: dict):
return sum(b[key]*value for key, value in a.items() if key in b)It will scale better since we are never checking what we don't have to check, the only thing we have to check is if there is a corresponding index in the other list (now dict).
Code Snippets
def dot_product(v1, v2):
...def dot_product1(v1: [()], v2: [()]):
......
return result
if __name__ == "__main__":
......
return result
if __name__ == "__main__":
...def get_pointers(v: tuple):
return {key: value for key, value in enumerate(v) if value}
def sparse_vector_dot(a: dict, b: dict):
result = 0
for key, value in a.items():
if key in b:
result += b[key] * value
return result
def main():
a = (2, 4, 0, 0, 0, 6)
b = (0, 3, 4, 0, 0, 7)
a, b = get_pointers(a), get_pointers(b)
print(sparse_vector_dot(a, b))Context
StackExchange Code Review Q#148717, answer score: 4
Revisions (0)
No revisions yet.