patterncppMinor
Computing the total different between a pixel and its neighborhood
Viewed 0 times
totalthepixeldifferentitsbetweenandneighborhoodcomputing
Problem
I have C++ code which written for mex in MATLAB. It used to compute the total different between a pixel and its neighborhood (8 neighborhood in my code). The code ran and gave me an expected result. However, I think that we can make my code more faster (because of a lot of
```
#include "mex.h"
#include
#include
#define img(i,j) img[(i) + (j)*M]
void mexFunction( int nlhs, mxArray plhs[], int nrhs, const mxArray prhs[] )
{
double img, G;
int N, M, class_count;
int ndim=3;
double beta, energy;
N = (int) mxGetN(prhs[0]); //cols
M = (int) mxGetM(prhs[0]); //rows
img = (double *)mxGetData(prhs[0]);
beta = (double )*mxGetPr(prhs[1]);
class_count = (int )*mxGetPr(prhs[2]);
/ Initial zeros matrix/
int dims[3] = {M-2, N-2, class_count};
plhs[0] = mxCreateNumericArray(ndim, dims, mxDOUBLE_CLASS, mxREAL);
G = (double *) mxGetPr(plhs[0]);
/////////////Main///////////////
for (int label=2; label <= class_count+1; label++) {
for (int i=1; i < M-1; i++) {
for (int j=1; j < N-1; j++) {
energy = 0;
// North, south, east and west
if (label == img(i-1,j)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i,j+1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i+1,j)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i,j-1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i-1,j-1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i-1,j+1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i+1,j+1))
if statement). I am very happy if someone can help me to speed up it.```
#include "mex.h"
#include
#include
#define img(i,j) img[(i) + (j)*M]
void mexFunction( int nlhs, mxArray plhs[], int nrhs, const mxArray prhs[] )
{
double img, G;
int N, M, class_count;
int ndim=3;
double beta, energy;
N = (int) mxGetN(prhs[0]); //cols
M = (int) mxGetM(prhs[0]); //rows
img = (double *)mxGetData(prhs[0]);
beta = (double )*mxGetPr(prhs[1]);
class_count = (int )*mxGetPr(prhs[2]);
/ Initial zeros matrix/
int dims[3] = {M-2, N-2, class_count};
plhs[0] = mxCreateNumericArray(ndim, dims, mxDOUBLE_CLASS, mxREAL);
G = (double *) mxGetPr(plhs[0]);
/////////////Main///////////////
for (int label=2; label <= class_count+1; label++) {
for (int i=1; i < M-1; i++) {
for (int j=1; j < N-1; j++) {
energy = 0;
// North, south, east and west
if (label == img(i-1,j)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i,j+1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i+1,j)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i,j-1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i-1,j-1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i-1,j+1)) energy = energy-beta;
else energy = energy+beta;
if (label == img(i+1,j+1))
Solution
Here are some things that may help you improve your code.
Measure your code
Since we don't have your data or your computer or your compiler, we're a little bit hampered in trying to identify things that will speed up your code. However, you have access to all of those things, so rather than guess that it's the cause of the
Think about "locality of reference"
With modern computers, cacheing is often a factor in determining performance. Generally, data items that are "close together" (that is, items that all fit in the cache at once) will be quicker to access than those that are "far apart" (that is, items that do not all fit in the cache at once). For this reason, we can think about restructuring the algorithm to try to work on as small a memory footprint as possible. In this case it's possible you can gain performance by simply rearranging the nested loops. Right now, it looks like this:
But I think I'd try arranging things like this instead:
This way, the input data is accessed in a somewhat more sequential manner. The output data is not, but because it's only accessed once per loop, I'm guessing that it makes less difference.
Simplify your code
One of the things your code does is to either add or subtract an energy value if a particular value is matched. Because integer math is often faster than floating point, you might benefit from restructuring your code. Here's one way to do that. Change the interior of the loop to look like this instead:
Use
Many of these values are set once and then not changed. The compiler may be able to find better optimizations if you explicitly tell it which values are
Don't initialize large data items pointlessly
It appears that every member of
Measure your code
Since we don't have your data or your computer or your compiler, we're a little bit hampered in trying to identify things that will speed up your code. However, you have access to all of those things, so rather than guess that it's the cause of the
if-then constructs or anything else, the prudent thing to do would be to measure the code and find out (e.g. with a profiler) where your code is spending most of its time. Armed with that data, you can begin to effectively target the real bottlenecks.Think about "locality of reference"
With modern computers, cacheing is often a factor in determining performance. Generally, data items that are "close together" (that is, items that all fit in the cache at once) will be quicker to access than those that are "far apart" (that is, items that do not all fit in the cache at once). For this reason, we can think about restructuring the algorithm to try to work on as small a memory footprint as possible. In this case it's possible you can gain performance by simply rearranging the nested loops. Right now, it looks like this:
for (int label=2; label <= class_count+1; label++) {
for (int i=1; i < M-1; i++) {
for (int j=1; j < N-1; j++) {But I think I'd try arranging things like this instead:
for (int j=1; j < N-1; j++) {
for (int i=1; i < M-1; i++) {
for (int label=2; label <= class_count+1; label++) {This way, the input data is accessed in a somewhat more sequential manner. The output data is not, but because it's only accessed once per loop, I'm guessing that it makes less difference.
Simplify your code
One of the things your code does is to either add or subtract an energy value if a particular value is matched. Because integer math is often faster than floating point, you might benefit from restructuring your code. Here's one way to do that. Change the interior of the loop to look like this instead:
int energycount = 8;
if (label == img(i-1,j-1)) energycount -= 2;
if (label == img(i ,j-1)) energycount -= 2;
if (label == img(i+1,j-1)) energycount -= 2;
if (label == img(i-1,j )) energycount -= 2;
if (label == img(i+1,j )) energycount -= 2;
if (label == img(i-1,j+1)) energycount -= 2;
if (label == img(i ,j+1)) energycount -= 2;
if (label == img(i+1,j+1)) energycount -= 2;
G[(i-1) + (j-1)* (M-2) + (label-2) * (N-2) * (M-2)]=beta*energycount;Use
const where practicalMany of these values are set once and then not changed. The compiler may be able to find better optimizations if you explicitly tell it which values are
const:const int ndim = 3;
const int N = (int) mxGetN(prhs[0]); //cols
const int M = (int) mxGetM(prhs[0]); //rows
const double *img = (double *)mxGetData(prhs[0]);
const double beta = (double )*mxGetPr(prhs[1]);
const int class_count = (int )*mxGetPr(prhs[2]);
/* Initial zeros matrix*/
const int dims[3] = {M-2, N-2, class_count};
plhs[0] = mxCreateNumericArray(ndim, dims, mxDOUBLE_CLASS, mxREAL);
double *G = (double *) mxGetPr(plhs[0]);Don't initialize large data items pointlessly
It appears that every member of
G is initialized by the loop, so there's no real need to initialize it before that. For that reason, I'd recommend using mxCreateUninitNumericArray to create an unitialized array rather than mxCreateNumericArray which initializes all values to 0.Code Snippets
for (int label=2; label <= class_count+1; label++) {
for (int i=1; i < M-1; i++) {
for (int j=1; j < N-1; j++) {for (int j=1; j < N-1; j++) {
for (int i=1; i < M-1; i++) {
for (int label=2; label <= class_count+1; label++) {int energycount = 8;
if (label == img(i-1,j-1)) energycount -= 2;
if (label == img(i ,j-1)) energycount -= 2;
if (label == img(i+1,j-1)) energycount -= 2;
if (label == img(i-1,j )) energycount -= 2;
if (label == img(i+1,j )) energycount -= 2;
if (label == img(i-1,j+1)) energycount -= 2;
if (label == img(i ,j+1)) energycount -= 2;
if (label == img(i+1,j+1)) energycount -= 2;
G[(i-1) + (j-1)* (M-2) + (label-2) * (N-2) * (M-2)]=beta*energycount;const int ndim = 3;
const int N = (int) mxGetN(prhs[0]); //cols
const int M = (int) mxGetM(prhs[0]); //rows
const double *img = (double *)mxGetData(prhs[0]);
const double beta = (double )*mxGetPr(prhs[1]);
const int class_count = (int )*mxGetPr(prhs[2]);
/* Initial zeros matrix*/
const int dims[3] = {M-2, N-2, class_count};
plhs[0] = mxCreateNumericArray(ndim, dims, mxDOUBLE_CLASS, mxREAL);
double *G = (double *) mxGetPr(plhs[0]);Context
StackExchange Code Review Q#126578, answer score: 5
Revisions (0)
No revisions yet.