patterncppMinor
Optimization of YUV422 to RGB
optimization, rgb, yuv422
Problem
I have a stream of YUV422-encoded images coming in from a camera via DirectShow. I run each byte array through a conversion method and return the RGB array to C# for display in a WPF app.
My problem is that when I run the application for 10 minutes and profile it, the CPU usage is too high for my needs, and around 70-80% of the time is spent in the YUV422 to RGB conversion method.
The method works really well but is just too slow.
Are there any optimizations I am missing here? (I am mainly a C# programmer, so I can easily miss C++ optimizations.)
N.B. This method is in a C++ DLL that is P/Invoked from C#. The reason for using C++ is that a similar method takes twice as long in C# as in C++.
const int STANDARD_SIZE = 1036800;
// Clamp out of range values
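#define CLAMP(t) (((t)>255)?255:(((t)<0)?0:(t)))
// YUV to RGB conversion (integer approximation)
#define GET_R_FROM_YUV(y, u, v) ((298*y+409*v+128)>>8)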
#define GET_G_FROM_YUV(y, u, v) ((298*y-100*u-208*v+128)>>8)
#define GET_B_FROM_YUV(y, u, v) ((298*y+516*u+128)>>8)
BOOL __stdcall YUV422toRGB888(unsigned char *d, unsigned char *s)
{
    for (unsigned int i = 0; i < STANDARD_SIZE; ++i)
    {
        int y0 = *s++ - 16;
        int u0 = *s++ - 128;
        int y2 = *s++ - 16;
        int v = *s++ - 128;
        // BGR
        *d++ = CLAMP(GET_B_FROM_YUV(y0, u0, v));
        *d++ = CLAMP(GET_G_FROM_YUV(y0, u0, v));
        *d++ = CLAMP(GET_R_FROM_YUV(y0, u0, v));
        // BGR
        *d++ = CLAMP(GET_B_FROM_YUV(y2, u0, v));
        *d++ = CLAMP(GET_G_FROM_YUV(y2, u0, v));
        *d++ = CLAMP(GET_R_FROM_YUV(y2, u0, v));
    }
    return true;
}
Solution
Be mindful of endianness; also, some systems deliver UYVY instead of YUYV.
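To make the byte-order point concrete, here is a minimal sketch of a converter that takes the packing as a parameter, so both orders can be tried without editing the loop. The names `Yuv422Layout`, `clamp_u8`, and `yuv422_to_bgr888` are hypothetical and not from the original post. The coefficients are the integer approximation used in the question, with 409 assumed for the red channel since that is the value commonly published alongside them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; not part of the original code.
enum class Yuv422Layout { YUYV, UYVY };

// Clamp an intermediate value to the 0..255 range of an 8-bit channel.
static inline std::uint8_t clamp_u8(int t) {
    return static_cast<std::uint8_t>(t < 0 ? 0 : (t > 255 ? 255 : t));
}

// Converts `pairs` macropixels: 4 input bytes become 2 BGR pixels (6 bytes).
void yuv422_to_bgr888(std::uint8_t* dst, const std::uint8_t* src,
                      std::size_t pairs, Yuv422Layout layout) {
    assert(dst != nullptr && src != nullptr);
    // Byte offsets of Y0, U, Y1, V inside one 4-byte macropixel.
    const bool yuyv = (layout == Yuv422Layout::YUYV);
    const int iy0 = yuyv ? 0 : 1;
    const int iu  = yuyv ? 1 : 0;
    const int iy1 = yuyv ? 2 : 3;
    const int iv  = yuyv ? 3 : 2;
    for (std::size_t i = 0; i < pairs; ++i, src += 4, dst += 6) {
        const int y0 = 298 * (src[iy0] - 16);
        const int y1 = 298 * (src[iy1] - 16);
        const int u = src[iu] - 128;
        const int v = src[iv] - 128;
        // Both pixels of a pair share U and V, so the chroma terms
        // (including the +128 rounding offset) are computed once.
        const int b = 516 * u + 128;
        const int g = -100 * u - 208 * v + 128;
        const int r = 409 * v + 128;  // assumed red coefficient
        dst[0] = clamp_u8((y0 + b) >> 8);
        dst[1] = clamp_u8((y0 + g) >> 8);
        dst[2] = clamp_u8((y0 + r) >> 8);
        dst[3] = clamp_u8((y1 + b) >> 8);
        dst[4] = clamp_u8((y1 + g) >> 8);
        dst[5] = clamp_u8((y1 + r) >> 8);
    }
}
```

If the picture comes out with wrong colors under one layout, switching the `layout` argument is a quick way to test the other packing. Computing the chroma terms once per pair may also trim some arithmetic, though whether that is measurable would need profiling.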
Sometimes it just takes some trial and error if your docs aren't very good :-(. For me, the following byte order (U, Y, V, Y, i.e. UYVY) worked great with the rest of the code:
int u0 = *yuv_in++ - 128;
int y0 = *yuv_in++ - 16;
int v = *yuv_in++ - 128;
int y2 = *yuv_in++ - 16;
Code Snippets
int u0 = *yuv_in++ - 128;
int y0 = *yuv_in++ - 16;
int v = *yuv_in++ - 128;
int y2 = *yuv_in++ - 16;
Context
StackExchange Code Review Q#160396, answer score: 2