Fastest way to count non-zero pixels using Python and Pillow
Problem
I have a Python script that creates a diff of two images using PIL. That part works fine. Now I need to find an efficient way to count all the non-black pixels (which represent parts of the two images that are different). The diff image is in RGB mode.
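For context, here is a minimal sketch of the kind of diff image being discussed. The question doesn't show how the diff is produced, so the use of ImageChops.difference and the two tiny test images below are assumptions for illustration only.

```python
from PIL import Image, ImageChops

# Hypothetical inputs: two 3x2 RGB images that differ in one pixel,
# and only by 1 in the blue channel.
image_a = Image.new("RGB", (3, 2), (10, 20, 30))
image_b = image_a.copy()
image_b.putpixel((2, 1), (10, 20, 31))

# Per-channel absolute difference; identical pixels come out black.
diffimage = ImageChops.difference(image_a, image_b)

# Bounding box of the non-black region as (left, upper, right, lower).
changed_box = diffimage.getbbox()

# Two identical images give an all-black diff, for which getbbox() is None.
identical_box = ImageChops.difference(image_a, image_a).getbbox()
```

Note that the single differing pixel has the value (0, 0, 1) in the diff, which is exactly the "small difference in one channel" case that matters later.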
My initial cut was something like this:

    return sum(x != (0, 0, 0) for x in diffimage.getdata())

Then I realized that the diffs were usually constrained to a portion of the image, so I used getbbox() to find the actual diff data:

    bbox = diffimage.getbbox()
    return sum(x != (0, 0, 0) for x in diffimage.crop(bbox).getdata()) if bbox else 0

This has the advantage of being VERY fast when the image is all black since bbox is None in that case and no pixel counting need be done.

I still wasn't satisfied, so I decided to try using more of the built-in PIL methods to avoid the generator expression with the Python conditional that needed to be evaluated for each pixel. I came up with:

    bbox = diffimage.getbbox()
    if not bbox: return 0
    return sum(diffimage.crop(bbox)
               .point(lambda x: 255 if x else 0)
               .convert("L")
               .point(bool)
               .getdata())

This is about five times faster than the previous version. The basic steps are:
- Crop to the bounding box to avoid counting black pixels
- Convert all non-zero values in each channel to 255. This way, when we later convert it to grayscale, all non-black pixels are guaranteed to have non-zero values. (Because a pixel might differ in only one channel, and only by a small amount, some pixels that are not actually black might end up as black in grayscale mode, because only a fraction of that channel's value makes its way to grayscale.) BTW, the function isn't evaluated for each pixel but only once for each possible pixel value to make a lookup table, so it's very fast.
- Convert to grayscale.
- Convert all non-zero pixels to 1 using bool.
- Sum all the pixel values.
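To illustrate why the second step matters, here is a small sketch on a made-up 2x2 image (the image and the variable names are hypothetical). It compares the pipeline with and without saturating the channels first; the explicit lambda in the last step plays the same role as point(bool) above.

```python
from PIL import Image

# Hypothetical test image: black except for one pixel that differs
# only slightly, and only in the blue channel.
img = Image.new("RGB", (2, 2), (0, 0, 0))
img.putpixel((1, 0), (0, 0, 1))

# Skipping the saturation step: the faint pixel rounds down to 0 in
# grayscale, so it is not counted.
naive_count = sum(img.convert("L")
                     .point(lambda v: 1 if v else 0)
                     .getdata())

# With the saturation step: every non-zero channel becomes 255, so the
# grayscale value of any non-black pixel stays above zero.
robust_count = sum(img.point(lambda v: 255 if v else 0)
                      .convert("L")
                      .point(lambda v: 1 if v else 0)
                      .getdata())
```

On this image the first count misses the faint pixel and the second one finds it, which is the failure mode the parenthetical in the list describes.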
Can I do b
Solution
Here's your implementation using Pillow:
    def count_nonblack_pil(img):
        """Return the number of pixels in img that are not black.
        img must be a PIL.Image object in mode RGB.

        """
        bbox = img.getbbox()
        if not bbox: return 0
        return sum(img.crop(bbox)
                   .point(lambda x: 255 if x else 0)
                   .convert("L")
                   .point(bool)
                   .getdata())

And here's an implementation using Numpy:
    def count_nonblack_np(img):
        """Return the number of pixels in img that are not black.
        img must be a Numpy array with colour values along the last axis.

        """
        return img.any(axis=-1).sum()

(We will need scipy.ndimage.imread to load the image.)

Here's a quick performance comparison using timeit.timeit:

    >>> from PIL import Image
    >>> import scipy.ndimage
    >>> from timeit import timeit
    >>> img1 = Image.open(filename)
    >>> timeit(lambda:count_nonblack_pil(img1), number=10)
    5.4229461060022
    >>> img2 = scipy.ndimage.imread(filename)
    >>> timeit(lambda:count_nonblack_np(img2), number=10)
    2.3291947869875003

So Numpy is a bit more than twice as fast on my example (about 2.3 times).
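If scipy.ndimage.imread is not available in your SciPy install (it was removed in later releases), the array can also be obtained directly from a PIL image with numpy.asarray. The sketch below assumes that route; the helper name and the test image are hypothetical, and it cross-checks the result against the plain per-pixel count from the question.

```python
import numpy as np
from PIL import Image


def count_nonblack_from_pil(img):
    """Count non-black pixels of a PIL image via Numpy (hypothetical helper)."""
    # For an RGB image this is a (height, width, 3) uint8 array.
    arr = np.asarray(img.convert("RGB"))
    # any(axis=-1) is True wherever at least one channel is non-zero.
    return int(arr.any(axis=-1).sum())


# Hypothetical test image: 4x3, black except for two pixels.
img = Image.new("RGB", (4, 3), (0, 0, 0))
img.putpixel((0, 0), (0, 0, 1))
img.putpixel((3, 2), (255, 255, 255))

count = count_nonblack_from_pil(img)

# Reference value using the straightforward generator expression.
reference = sum(px != (0, 0, 0) for px in img.getdata())
```

Because any() tests each channel directly, no grayscale conversion is involved and the faint (0, 0, 1) pixel is counted without any saturation step.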
Context
StackExchange Code Review Q#55902, answer score: 8