Does JPEG compression use a heuristic model similar to MP3?
Problem
I know that MP3 uses a compression heuristic that allows it to exclude storing frequency data outside a certain range due to the limits of human hearing.
Does JPEG employ some sort of non-linear quantization heuristic that accounts for the limitations of humans to perceive small differences in specific color bands or ranges?
Thank you in advance for any responses.
Solution
The JPEG algorithm is lossy in several ways. The main one is the quantization of the Fourier (DCT) coefficients, with high-order coefficients quantized most coarsely; a lesser one is the downsampling of the color components.
JPEG compression of a grayscale image (we deal with color images later on) is composed of four main steps:
- Divide the image into $8 \times 8$ blocks.
- Apply a two-dimensional Fourier-type transform to each block (specifically, the discrete cosine transform, or DCT).
- Quantize each of the resulting coefficients.
- Encode the quantized coefficients losslessly using Huffman coding.
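The first three steps above can be sketched in plain Python. This is a minimal illustration rather than a real encoder: the helper names (`split_into_blocks`, `dct_2d`, `quantize`) are made up for this example, the samples are assumed to be already level-shifted to be centered on zero, the image dimensions are assumed to be multiples of 8, the table is the example luminance table from Annex K of the JPEG standard, and the Huffman step is left out.

```python
import math

N = 8  # JPEG block size

# Example luminance quantization table (Annex K of the JPEG standard).
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def split_into_blocks(image):
    """Yield 8x8 blocks; assumes both dimensions are multiples of 8."""
    for top in range(0, len(image), N):
        for left in range(0, len(image[0]), N):
            yield [row[left:left + N] for row in image[top:top + N]]


def dct_2d(block):
    """Naive O(N^4) two-dimensional DCT-II of one 8x8 block."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round (the lossy step)."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


# Demo: a smooth 8x16 gradient, i.e. two horizontally adjacent blocks.
image = [[2 * x + 3 * y - 40 for y in range(16)] for x in range(8)]
for block in split_into_blocks(image):
    q = quantize(dct_2d(block), LUMA_Q)
    zeros = sum(v == 0 for row in q for v in row)
    print(f"DC = {q[0][0]}, zero coefficients: {zeros} of 64")
```

On smooth blocks like these, most of the quantized coefficients come out as zero, which is what makes the final entropy-coding step effective.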
In addition, the DC component (the average of each block) is encoded differentially, that is, as the difference from the DC component of the previous block. The lossy aspect here is the quantization, which is specified by a quantization matrix. Higher-order coefficients are quantized more heavily. The matrix entries were originally chosen experimentally, though in principle they can be optimized for each image, since the matrix is stored in the JPEG file. The quality parameter mainly affects the quantization matrix.
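To make the last two points concrete, here is a small sketch of how a quality setting can rescale a base quantization table, and of differential DC coding. The scaling rule is the one I understand the IJG libjpeg library to use; other encoders may map quality to tables differently, so treat it as an assumption. The function names are made up for this example, and only the first row of the Annex K luminance table is used to keep it short.

```python
def scale_table(base, quality):
    """Rescale a base quantization table for a quality setting in 1..100.

    Assumed rule (IJG libjpeg style): quality 50 leaves the table as is,
    lower quality enlarges the entries, higher quality shrinks them.
    """
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]


def dc_differences(dc_values):
    """Replace each block's DC value by its difference from the previous one."""
    diffs = []
    previous = 0  # the predictor starts at zero
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


# First row of the example luminance table from Annex K.
base_row = [[16, 11, 10, 16, 24, 40, 51, 61]]

print(scale_table(base_row, 50))   # unchanged
print(scale_table(base_row, 25))   # coarser quantization
print(scale_table(base_row, 100))  # every entry becomes 1
print(dc_differences([25, 26, 26, 24]))
```

Neighboring blocks tend to have similar averages, so the differences are usually small numbers that are cheap to encode.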
Color images are encoded by first moving into the YCbCr color space (similar to television's YPbPr), in which Y is the luma (brightness) component, and Cb and Cr are the color (chroma) components. The latter two components are usually downsampled by a factor of 2 in the horizontal dimension, and often in the vertical dimension as well (other choices, including no downsampling, are possible), since the human visual system is less sensitive to fine detail in color than in brightness. Each of the three components is then encoded as a grayscale image, using two different quantization matrices: one for the Y component, the other for the Cb and Cr components. The latter matrix quantizes more heavily.
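As a rough sketch of the color step, the following converts RGB samples to YCbCr and halves the horizontal resolution of a chroma row. The conversion constants are the full-range ones used by the JFIF file format, which is an assumption about the container rather than something stated above. Averaging neighboring pairs is just one simple way to downsample, and the function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB sample to YCbCr (full-range, JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_horizontal(row):
    """Halve a chroma row by averaging each pair of neighboring samples."""
    out = []
    for i in range(0, len(row), 2):
        pair = row[i:i + 2]  # the last "pair" has one sample if the length is odd
        out.append(sum(pair) / len(pair))
    return out


pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 5, 250)]
ycbcr = [rgb_to_ycbcr(*p) for p in pixels]

y_row = [y for y, _, _ in ycbcr]                           # kept at full resolution
cb_row = subsample_horizontal([cb for _, cb, _ in ycbcr])  # half resolution
cr_row = subsample_horizontal([cr for _, _, cr in ycbcr])  # half resolution
print(len(y_row), len(cb_row), len(cr_row))
```

The Y row keeps every sample while each chroma row keeps half as many, so this step alone reduces the data from 3 values per pixel to 2 before any quantization happens.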
Context
StackExchange Computer Science Q#54881, answer score: 2