patternpythonMinor

"Histogram of Oriented Gradients" (HOG) feature detector for computer vision

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

detectorhogvisionhistogramgradientsforcomputerorientedfeature

Problem

Here is a function that I've only slightly modified from its original context, found here.

Before mentioning anything else, it should be noted that I'm desperately trying to optimize this code for speed. It presently takes about 5.25 seconds to execute and it appears as though the bottleneck is happening in the for-loop.

In a nutshell, this function expects the user to have SimpleCV installed and expects, at a minimum, to be passed a SimpleCV.Image instance.

Does anybody have some clever approach for speeding things up? Ideally I'd be able to run this on a real-time webcam feed at 30 frames-per-second, but I'm not getting my hopes up.

```
from itertools import product
from math import floor, pi
import numpy as np
import cv2 # opencv 2

def findHOGFeatures(img, n_divs=6, n_bins=6):
"""
SUMMARY
Get HOG(Histogram of Oriented Gradients) features from the image.

PARAMETERS
img* - SimpleCV.Image instance
n_divs* - the number of divisions(cells).
n_divs* - the number of orientation bins.

RETURNS
Returns the HOG vector in a numpy array

"""
# Size of HOG vector
n_HOG = n_divs n_divs n_bins

# Initialize output HOG vector
# HOG = [0.0]*n_HOG
HOG = np.zeros((n_HOG, 1))
# Apply sobel on image to find x and y orientations of the image
Icv = img.getNumpyCv2()
Ix = cv2.Sobel(Icv, ddepth=cv.CV_32F, dx=1, dy=0, ksize=3)
Iy = cv2.Sobel(Icv, ddepth=cv.CV_32F, dx=0, dy=1, ksize=3)

Ix = Ix.transpose(1, 0, 2)
Iy = Iy.transpose(1, 0, 2)
cellx = img.width / n_divs # width of each cell(division)
celly = img.height / n_divs # height of each cell(division)

#Area of image
img_area = img.height * img.width

#Range of each bin
BIN_RANGE = (2 * pi) / n_bins

# m = 0
angles = np.arctan2(Iy, Ix)
magnit = ((Ix 2) + (Iy 2)) ** 0.5
it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in

Solution

As you indicated in the question, you need to vectorize the for loop:

it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in it:
    # grad value
    grad = magnit[m * cellx + i, n * celly + j][0]
    # normalized grad value
    norm_grad = grad / img_area
    # Orientation Angle
    angle = angles[m*cellx + i, n*celly+j][0]
    # (-pi,pi) to (0, 2*pi)
    if angle < 0:
        angle += 2 * pi
    nth_bin = floor(float(angle/BIN_RANGE))
    HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad

If you look at what this is doing, you're effectively labelling every pixel in the magnit array with a number below n_HOG, and then summing the normalized values for the pixels with each label.

Operations on labelled regions of images are jobs for the scipy.ndimage.measurements module. Here we can use scipy.ndimage.measurements.sum:

bins = (angles[...,0] % (2 * pi) / BIN_RANGE).astype(int)
x, y = np.mgrid[:width, :height]
x = x * n_divs // width
y = y * n_divs // height
labels = (x * n_divs + y) * n_bins + bins
index = np.arange(n_HOG)
HOG = scipy.ndimage.measurements.sum(magnit[...,0], labels, index)
return HOG / img_area

Notes:

-
I've used % (2 * pi) to get the angles in the range [0, 2π). An alternative that's more like your code would be

angles[angles 

- 
I postponed the division by

img_area until after the summation, because it looks to me as though in the common case n_HOG is much less than img_area

 and so it's cheaper to do the division later when there are fewer items. (This means that the results differ very slightly from your code, so bear that in mind when checking.)

- 
I measure the vectorized version as being about 60 times faster than your

for

 loop, but it's still not going to be fast enough to run at 30 fps!

- 
I've written

angles[...,0] and magnit[...,0]

 here in order to drop the third axis. But I think it would make more sense if you dropped this axis earlier, before computing

angles and magnit, by writing Ix = Ix[...,0] or just Ix = Ix.reshape((height, width))

 if you know that the last axis has length 1.

Update

Based on comments, it looks as if you are using Python 2.7, where the division operator

/` takes the floor of the result if both arguments are integers. So I've changed the code above to use:

x = x * n_divs // width
y = y * n_divs // height

which is portable between Python 2 and Python 3, and simpler than my first attempt:

x = (x / width * n_divs).astype(int)
y = (y / height * n_divs).astype(int)

Code Snippets

it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in it:
    # grad value
    grad = magnit[m * cellx + i, n * celly + j][0]
    # normalized grad value
    norm_grad = grad / img_area
    # Orientation Angle
    angle = angles[m*cellx + i, n*celly+j][0]
    # (-pi,pi) to (0, 2*pi)
    if angle < 0:
        angle += 2 * pi
    nth_bin = floor(float(angle/BIN_RANGE))
    HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad

bins = (angles[...,0] % (2 * pi) / BIN_RANGE).astype(int)
x, y = np.mgrid[:width, :height]
x = x * n_divs // width
y = y * n_divs // height
labels = (x * n_divs + y) * n_bins + bins
index = np.arange(n_HOG)
HOG = scipy.ndimage.measurements.sum(magnit[...,0], labels, index)
return HOG / img_area

x = x * n_divs // width
y = y * n_divs // height

x = (x / width * n_divs).astype(int)
y = (y / height * n_divs).astype(int)

Context

StackExchange Code Review Q#42763, answer score: 9

Revisions (0)

No revisions yet.