"Histogram of Oriented Gradients" (HOG) feature detector for computer vision
Problem
Here is a function that I've only slightly modified from its original context, found here.
Before mentioning anything else, it should be noted that I'm desperately trying to optimize this code for speed. It presently takes about 5.25 seconds to execute, and the bottleneck appears to be in the `for`-loop.
In a nutshell, this function expects the user to have SimpleCV installed and expects, at a minimum, to be passed a `SimpleCV.Image` instance.
Does anybody have some clever approach for speeding things up? Ideally I'd be able to run this on a real-time webcam feed at 30 frames-per-second, but I'm not getting my hopes up.
```
from itertools import product
from math import floor, pi

import numpy as np
import cv2  # opencv 2


def findHOGFeatures(img, n_divs=6, n_bins=6):
    """
    SUMMARY
    Get HOG (Histogram of Oriented Gradients) features from the image.

    PARAMETERS
    img* - SimpleCV.Image instance
    n_divs* - the number of divisions (cells).
    n_bins* - the number of orientation bins.

    RETURNS
    Returns the HOG vector in a numpy array
    """
    # Size of HOG vector
    n_HOG = n_divs * n_divs * n_bins
    # Initialize output HOG vector
    # HOG = [0.0]*n_HOG
    HOG = np.zeros((n_HOG, 1))
    # Apply sobel on image to find x and y orientations of the image
    Icv = img.getNumpyCv2()
    Ix = cv2.Sobel(Icv, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=3)
    Iy = cv2.Sobel(Icv, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=3)
    Ix = Ix.transpose(1, 0, 2)
    Iy = Iy.transpose(1, 0, 2)
    cellx = img.width / n_divs  # width of each cell (division)
    celly = img.height / n_divs  # height of each cell (division)
    # Area of image
    img_area = img.height * img.width
    # Range of each bin
    BIN_RANGE = (2 * pi) / n_bins
    # m = 0
    angles = np.arctan2(Iy, Ix)
    magnit = ((Ix ** 2) + (Iy ** 2)) ** 0.5
    it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
    for m, n, i, j in it:
        # grad value
        grad = magnit[m * cellx + i, n * celly + j][0]
        # normalized grad value
        norm_grad = grad / img_area
        # Orientation Angle
        angle = angles[m * cellx + i, n * celly + j][0]
        # (-pi, pi) to (0, 2*pi)
        if angle < 0:
            angle += 2 * pi
        nth_bin = floor(float(angle / BIN_RANGE))
        HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad
    return HOG
```
Solution
As you indicated in the question, you need to vectorize the `for` loop:

```
it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in it:
    # grad value
    grad = magnit[m * cellx + i, n * celly + j][0]
    # normalized grad value
    norm_grad = grad / img_area
    # Orientation Angle
    angle = angles[m*cellx + i, n*celly+j][0]
    # (-pi,pi) to (0, 2*pi)
    if angle < 0:
        angle += 2 * pi
    nth_bin = floor(float(angle/BIN_RANGE))
    HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad
```

If you look at what this is doing, you're effectively labelling every pixel in the `magnit` array with a number below `n_HOG`, and then summing the normalized values for the pixels with each label.

Operations on labelled regions of images are jobs for the `scipy.ndimage.measurements` module. Here we can use `scipy.ndimage.measurements.sum`:

```
width, height = img.width, img.height
bins = (angles[...,0] % (2 * pi) / BIN_RANGE).astype(int)
x, y = np.mgrid[:width, :height]
x = x * n_divs // width
y = y * n_divs // height
labels = (x * n_divs + y) * n_bins + bins
index = np.arange(n_HOG)
HOG = scipy.ndimage.measurements.sum(magnit[...,0], labels, index)
return HOG / img_area
```

Notes:
- I've used `% (2 * pi)` to get the angles in the range [0, 2π). An alternative that's more like your code would be `angles[angles < 0] += 2 * pi`.
- I postponed the division by `img_area` until after the summation, because it looks to me as though in the common case `n_HOG` is much less than `img_area` and so it's cheaper to do the division later when there are fewer items. (This means that the results differ very slightly from your code, so bear that in mind when checking.)
- I measure the vectorized version as being about 60 times faster than your `for` loop, but it's still not going to be fast enough to run at 30 fps!
- I've written `angles[...,0]` and `magnit[...,0]` here in order to drop the third axis. But I think it would make more sense if you dropped this axis earlier, before computing `angles` and `magnit`, by writing `Ix = Ix[...,0]` or just `Ix = Ix.reshape((height, width))` if you know that the last axis has length 1.
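If you want to check the labelling idea without SimpleCV or OpenCV installed, here is a minimal sketch on synthetic data. It rests on several assumptions that are not in the original code: the gradient magnitudes and angles are already 2-D arrays indexed as `[x, y]`, the image dimensions are exact multiples of `n_divs`, and the function names `hog_loop` and `hog_labelled` are made up for illustration. It also swaps in `np.bincount` with `weights` in place of `scipy.ndimage.measurements.sum`; for non-negative integer labels it should produce the same per-label sums.

```python
import numpy as np


def hog_loop(magnit, angles, n_divs, n_bins):
    """Reference version: visit every pixel of every cell explicitly."""
    width, height = magnit.shape
    cellx, celly = width // n_divs, height // n_divs
    bin_range = 2 * np.pi / n_bins
    hog = np.zeros(n_divs * n_divs * n_bins)
    for m in range(n_divs):
        for n in range(n_divs):
            for i in range(cellx):
                for j in range(celly):
                    px, py = m * cellx + i, n * celly + j
                    # Map the angle from (-pi, pi] into [0, 2*pi)
                    angle = angles[px, py] % (2 * np.pi)
                    nth_bin = int(angle / bin_range)
                    hog[(m * n_divs + n) * n_bins + nth_bin] += magnit[px, py]
    return hog / magnit.size


def hog_labelled(magnit, angles, n_divs, n_bins):
    """Vectorized version: label each pixel, then sum magnitudes per label."""
    width, height = magnit.shape
    n_hog = n_divs * n_divs * n_bins
    bin_range = 2 * np.pi / n_bins
    bins = (angles % (2 * np.pi) / bin_range).astype(int)
    # Cell index of each pixel along x and y
    x, y = np.mgrid[:width, :height]
    x = x * n_divs // width
    y = y * n_divs // height
    labels = (x * n_divs + y) * n_bins + bins
    # Weighted bincount adds up the magnitudes that share a label
    sums = np.bincount(labels.ravel(), weights=magnit.ravel(), minlength=n_hog)
    return sums / magnit.size


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    magnit = rng.rand(12, 18)                       # synthetic magnitudes
    angles = rng.uniform(-np.pi, np.pi, (12, 18))   # synthetic orientations
    slow = hog_loop(magnit, angles, n_divs=6, n_bins=6)
    fast = hog_labelled(magnit, angles, n_divs=6, n_bins=6)
    print(np.allclose(slow, fast))
```

Comparing the two with `np.allclose` rather than `==` allows for the small floating-point differences that come from adding the values in a different order.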
Update
Based on comments, it looks as if you are using Python 2.7, where the division operator `/` takes the floor of the result if both arguments are integers. So I've changed the code above to use:

```
x = x * n_divs // width
y = y * n_divs // height
```

which is portable between Python 2 and Python 3, and simpler than my first attempt:

```
x = (x / width * n_divs).astype(int)
y = (y / height * n_divs).astype(int)
```

Context
StackExchange Code Review Q#42763, answer score: 9