Using the SSIM method for large images
Problem
I'm trying to implement the SSIM method. It has already been implemented in Python (source code), but my goal is to implement it using only Python and NumPy.
My goal is also to use this method on big images (1024x1024 and above). But filter2 is very slow (approx. 62 s for a 1024x1024 image). cProfile shows that _methods.py:16(_sum), fromnumeric.py:1422(sum), and method 'reduce' of 'numpy.ufunc' objects take most of the run time.
```
import numpy
from numpy.lib.stride_tricks import as_strided

def filter2(window, x):
    range1 = x.shape[0] - window.shape[0] + 1
    range2 = x.shape[1] - window.shape[1] + 1
    res = numpy.zeros((range1, range2), dtype=numpy.double)
    # View x as overlapping 11x11 patches and weight each patch by the window.
    x1 = as_strided(x, (x.shape[0] - 10, x.shape[1] - 10, 11, 11),
                    (x.strides[0], x.strides[1], x.strides[0], x.strides[1])) * window
    # Slow part: one Python-level sum() call per output pixel.
    for i in range(range1):
        for j in range(range2):
            res[i, j] = x1[i, j].sum()
    return res

def ssim(img1, img2):
    window = numpy.array([
        [0.0000, 0.0000, 0.0000, 0.0001, 0.0002, 0.0003, 0.0002, 0.0001, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.0001, 0.0003, 0.0008, 0.0016, 0.0020, 0.0016, 0.0008, 0.0003, 0.0001, 0.0000],
        [0.0000, 0.0003, 0.0013, 0.0039, 0.0077, 0.0096, 0.0077, 0.0039, 0.0013, 0.0003, 0.0000],
        [0.0001, 0.0008, 0.0039, 0.0120, 0.0233, 0.0291, 0.0233, 0.0120, 0.0039, 0.0008, 0.0001],
        [0.0002, 0.0016, 0.0077, 0.0233, 0.0454, 0.0567, 0.0454, 0.0233, 0.0077, 0.0016, 0.0002],
        [0.0003, 0.0020, 0.0096, 0.0291, 0.0567, 0.0708, 0.0567, 0.0291, 0.0096, 0.0020, 0.0003],
        [0.0002, 0.0016, 0.0077, 0.0233, 0.0454, 0.0567, 0.0454, 0.0233, 0.0077, 0.0016, 0.0002],
        [0.0001, 0.0008, 0.0039, 0.0120, 0.0233, 0.0291, 0.0233, 0.0120, 0.0039, 0.0008, 0.0001],
        [0.0000, 0.0003, 0.0013, 0.0039, 0.0077, 0.0096, 0.0077, 0.0039, 0.0013, 0.0003, 0.0000],
        [0.0000, 0.0001, 0.0003, 0.0008, 0.0016, 0.0020, 0.0016, 0.0008, 0.0003, 0.0001, 0.0000],
        [0.0000, 0.0000, 0.0000, 0.0001, 0.0002, 0
```
Solution
In your strided filter2, x1 has shape (1014, 1014, 11, 11). You are iterating over the first two dimensions in order to sum over the last two. Let sum do all the work for you: res = x1.sum((2,3)).
```
import numpy
from numpy.lib.stride_tricks import as_strided

def filter2(window, x):
    range1 = x.shape[0] - window.shape[0] + 1
    range2 = x.shape[1] - window.shape[1] + 1
    # View x as a (range1, range2, 11, 11) array of overlapping 11x11 patches,
    # then weight every patch by the window.
    x1 = as_strided(x, (range1, range2, 11, 11),
                    (x.strides[0], x.strides[1], x.strides[0], x.strides[1])) * window
    # Sum over the two patch axes in a single vectorized call.
    res = x1.sum((2, 3))
    return res
```
In my tests this gives a 6x speed improvement.
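To confirm that replacing the double loop with a single sum over axes (2, 3) does not change the result, the two versions can be compared on a small random image. This is only a sketch, assuming Python 3 and float64 input; the helper names patches, filter2_loop and filter2_vectorized are made up for this illustration and are not part of the original code.

```python
import numpy
from numpy.lib.stride_tricks import as_strided

def patches(x, size=11):
    """Hypothetical helper: view of every size x size patch of x (no copy)."""
    rows = x.shape[0] - size + 1
    cols = x.shape[1] - size + 1
    return as_strided(x, (rows, cols, size, size),
                      (x.strides[0], x.strides[1], x.strides[0], x.strides[1]))

def filter2_loop(window, x):
    # Question's approach: one Python-level sum() call per output pixel.
    x1 = patches(x) * window
    res = numpy.zeros(x1.shape[:2], dtype=numpy.double)
    for i in range(x1.shape[0]):
        for j in range(x1.shape[1]):
            res[i, j] = x1[i, j].sum()
    return res

def filter2_vectorized(window, x):
    # Answer's approach: a single sum over the two patch axes.
    return (patches(x) * window).sum((2, 3))

# Small random test data (an assumption, chosen only to keep the check fast).
rng = numpy.random.default_rng(0)
window = rng.random((11, 11))
image = rng.random((64, 64))

loop_result = filter2_loop(window, image)
vec_result = filter2_vectorized(window, image)
print(loop_result.shape, numpy.allclose(loop_result, vec_result))
```

Wrapping each call with time.perf_counter() on a larger image gives a rough idea of the speedup on a particular machine; the exact factor will vary.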
With numpy, iteration, especially nested loops over large dimensions like 1014, is a speed killer. You want to vectorize this kind of thing as much as possible.
Traditionally Matlab had the same speed problems, but newer versions recognize and compile loops like yours. That's why your numpy version is so much slower.
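One further vectorization worth considering, which is not part of the answer above: multiplying the strided view by the window materializes a (1014, 1014, 11, 11) temporary, roughly 1 GB for float64 input. A possible alternative is to let numpy.einsum contract the patch axes directly against the window, so the large temporary is never created. The function name filter2_einsum is hypothetical, and whether this is faster on a given machine is an assumption that needs measuring.

```python
import numpy
from numpy.lib.stride_tricks import as_strided

def filter2_einsum(window, x):
    """Sketch: weighted patch sums without the big intermediate array."""
    wr, wc = window.shape
    rows = x.shape[0] - wr + 1
    cols = x.shape[1] - wc + 1
    # Read-only view of all overlapping wr x wc patches; no data is copied.
    view = as_strided(x, (rows, cols, wr, wc),
                      (x.strides[0], x.strides[1], x.strides[0], x.strides[1]),
                      writeable=False)
    # out[i, j] = sum over k, l of view[i, j, k, l] * window[k, l]
    return numpy.einsum('ijkl,kl->ij', view, window)

# Usage on small random data (assumed here only for illustration).
rng = numpy.random.default_rng(1)
demo_window = rng.random((11, 11))
demo_image = rng.random((32, 32))
demo_out = filter2_einsum(demo_window, demo_image)
print(demo_out.shape)
```

The view is created with writeable=False because overlapping patches share memory, so writing through it would silently modify several patches at once.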
Code Snippets
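```
import numpy
from numpy.lib.stride_tricks import as_strided

def filter2(window, x):
    range1 = x.shape[0] - window.shape[0] + 1
    range2 = x.shape[1] - window.shape[1] + 1
    # Overlapping 11x11 patches of x, each weighted by the window.
    x1 = as_strided(x, (range1, range2, 11, 11),
                    (x.strides[0], x.strides[1], x.strides[0], x.strides[1])) * window
    # Vectorized sum over the two patch axes replaces the nested loops.
    res = x1.sum((2, 3))
    return res
```
Context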
StackExchange Code Review Q#31089, answer score: 2