What is the difference between 'features' and 'descriptors' in computer vision / machine learning?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Problem

I've repeatedly read sentences like the following:

Finally, for standard image classification bag-of-words
features based on SIFT descriptors have been found
critical for high performances. We first compute a
standard SIFT descriptor at regular grid points over
the whole image.

Source: "Multi-class image segmentation using Conditional Random Fields
and Global Classification" by Nils Plath, Marc Toussaint, Shinichi Nakajima.

What is a descriptor? I thought SIFT was an algorithm that operates on images and produces features (vectors in $\mathbb{R}^n$, where $n$ is fixed for a given image size and a given set of SIFT parameters).

Solution

The SIFT descriptor vector is a feature vector. "Descriptor vector" and "feature vector" are synonyms in this context. Most of the descriptions of SIFT I've seen use the phrase "descriptor vector", but occasionally they'll call it a "feature vector" or refer to it as "SIFT features", perhaps to draw upon intuition from machine learning.

SIFT analyzes the image, identifies a set of keypoints (points in the image that will be helpful for alignment), and then computes a descriptor vector (a feature vector) for each keypoint. It then uses the descriptor vectors for the keypoints in image $I_1$ and the descriptor vectors for the keypoints in image $I_2$ to try to align the two images to each other. The intuition is that if the descriptor vector for a keypoint in image $I_1$ is "similar" to a descriptor vector for a keypoint in image $I_2$, then maybe those two points should be aligned to each other. Here "similarity" is measured by the Euclidean distance between the two descriptor vectors: the smaller the distance, the more similar the keypoints.
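
To make the matching step concrete, here is a minimal sketch of nearest-neighbour matching by Euclidean distance. It assumes the descriptors have already been computed; the 4-dimensional vectors and the function name `match_descriptors` are made up for illustration (real SIFT descriptors are typically 128-dimensional), and this is not SIFT's own implementation.

```python
import math

def match_descriptors(desc_a, desc_b):
    """For each descriptor in desc_a, find the closest descriptor in desc_b.

    Returns a list of (index_in_a, index_in_b, euclidean_distance) tuples.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Index of the descriptor in desc_b with the smallest Euclidean distance to d.
        j = min(range(len(desc_b)), key=lambda k: math.dist(d, desc_b[k]))
        matches.append((i, j, math.dist(d, desc_b[j])))
    return matches

# Hypothetical toy descriptors: one vector per keypoint.
image1 = [[0.9, 0.1, 0.0, 0.2],
          [0.1, 0.8, 0.7, 0.0]]
image2 = [[0.1, 0.9, 0.6, 0.1],
          [0.8, 0.2, 0.1, 0.2],
          [0.5, 0.5, 0.5, 0.5]]

matches = match_descriptors(image1, image2)
for i, j, dist in matches:
    print(f"keypoint {i} in image 1 <-> keypoint {j} in image 2 (distance {dist:.3f})")
```

In this toy data, keypoint 0 of the first image pairs with keypoint 1 of the second, and keypoint 1 pairs with keypoint 0. A practical matcher would also reject ambiguous matches, which this sketch leaves out.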

Thus, a descriptor vector for a keypoint is a vector, e.g., in $\mathbb{R}^{128}$, chosen so that if the image is translated, scaled, rotated, etc., then the descriptor vector for that point won't be changed much by the transformation.
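
As a rough illustration of the rotation part of that invariance, the toy sketch below builds a histogram of gradient orientations and then shifts it so that the dominant orientation comes first. This is a deliberately simplified stand-in for what SIFT does (SIFT measures orientations relative to a dominant orientation assigned to each keypoint, and its real descriptor is considerably more elaborate); the angle values and the function names are invented for the example.

```python
def orientation_histogram(angles_deg, bins=8):
    """Count how many gradient orientations fall into each of `bins` equal sectors."""
    width = 360 // bins
    hist = [0] * bins
    for a in angles_deg:
        hist[int(a % 360) // width] += 1
    return hist

def canonical(hist):
    """Cyclically shift the histogram so the dominant bin comes first."""
    k = hist.index(max(hist))
    return hist[k:] + hist[:k]

# Hypothetical gradient orientations around one keypoint, in degrees.
angles = [10, 15, 20, 30, 100, 190, 280, 285]
# Rotating the image by 90 degrees rotates every gradient by 90 degrees.
rotated = [(a + 90) % 360 for a in angles]

raw_before = orientation_histogram(angles)
raw_after = orientation_histogram(rotated)
desc_before = canonical(raw_before)
desc_after = canonical(raw_after)

print(raw_before, raw_after)    # the raw histograms differ
print(desc_before, desc_after)  # the normalized ones agree
```

The raw histograms change under rotation, but the orientation-normalized versions are identical. This only works exactly here because the rotation is a multiple of the bin width and the dominant bin is unique; with real images the descriptor changes a little, which is why the claim is "won't be changed much" and not "won't change".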

You can find a reasonable description of SIFT in Wikipedia: https://en.wikipedia.org/wiki/Scale-invariant_feature_transform

Context

StackExchange Computer Science Q#51373, answer score: 2
