HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Detect optimized

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
optimizeddetectstackoverflow

Problem

I've been trying to optimize this piece of code:

void detect_optimized(int width, int height, int threshold)
{
  int x, y, z;
  int tmp;

  for (y = 1; y 255)
          tmp = 255;
        if (tmp<threshold)
          tmp = 0;
        c[x][y][z] = 255-tmp;
      }
  return;
}


So far I've tried "Blocking" and a few other things, but I can't seem to get it to run any faster.

Blocking resulted in:

for(yy = 1; yy255)
                          tmp = 255;
                         if (tmp<threshold)
                          tmp = 0;
                         c[x][y][z] = 255-tmp;
                      }}}}}


Which ran at the same speed as the original program.

Any suggestions would be great.

mask_function cannot be changed, but here is its code:

int mask_product(int m[3][3], byte bitmap[MAX_ROW][MAX_COL][NUM_COLORS], int x, int y, int z)
{
  int tmp[9];
  int i, sum;

// ADDED THIS LINE (sum = 0) TO FIX THE BUG
  sum = 0;

  tmp[0] = m[0][0]*bitmap[x-1][y-1][z];
  tmp[1] = m[1][0]*bitmap[x][y-1][z];
  tmp[2] = m[2][0]*bitmap[x+1][y-1][z];
  tmp[3] = m[0][1]*bitmap[x-1][y][z];
  tmp[4] = m[1][1]*bitmap[x][y][z];
  tmp[5] = m[2][1]*bitmap[x+1][y][z];
  tmp[6] = m[0][2]*bitmap[x-1][y+1][z];
  tmp[7] = m[1][2]*bitmap[x][y+1][z];
  tmp[8] = m[2][2]*bitmap[x+1][y+1][z];
  for (i=0; i<9; i++)
    sum = sum + tmp[i];
  return sum;
}

Solution

I am afraid I can't do much about the performance but here are some other tips:

  • The z loop bound is not checked against NUM_COLORS (which I believe is the size of the array for this index).



  • z should be renamed to something like colorIndex.



  • Using prefix increments (++x) might give a speedup



-
Precalculating loop bounds to avoid repeated recalculation might give another speedup: maxY = width - 1; and then for (y = 1; y

-
tmp should have a better name (what does it actually contain/represent?)

-
place the
ifs into an own function (or macro) clamp (which is a widely used name and maybe has a standard implementation) and use it to make the code cleaner: tmp = clamp(tmp, 0, 255);

-
do not use global variables instead of function parameters, they introduce hard to debug side effects

  • Although you cannot change the mask function I would recommend avoiding the tmp` array and instead directly summing up into sum.



You could gain some speedup by replacing the inner loop over the colors by MMX functionality which allows to do the same calculation on all colors at once. However, I am no expert in this and it likely requires inline assembly.

Context

StackExchange Code Review Q#47600, answer score: 5

Revisions (0)

No revisions yet.