patterncMinor

Radix-2 FFT in C

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

codereview c stackoverflow beginner signal-processing performance

radixfftstackoverflow

Problem

I'm a beginner in C programming. I am current trying to work on a project requiring 1024-point FFT implementation using radix-2, Decimation-in-frequency. I attach the FFT function C code. How can I increase the performance by modifying the C code?

```
#include "i_cmplx.h" / definition of the complex type /
#include "twiddle1024.h" / quantised and scaled Twiddle factors /
#define LL 1024 / Maximum length of FFT /

/ fft radix-2 funtion using Decimation In Frequency /

//#pragma CODE_SECTION(fft, "mycode"); // Makes the program run from internal memory

void fft(COMPLEX Y, int N) / FFT(input sample array, # of points) */
{
int temp1R, temp1I, temp2R,temp2I; / 32 bits temporary storage for /
/ intermediate results /
short tempR, tempI, c, s; / 16 bits temporary storages /
/ variables /
int TwFStep, / Step between twiddle factors /
TwFIndex, / Index of twiddle factors /
BLStep, / Step for incrementing butterfly index /
BLdiff, / Difference between upper and lower butterfly legs /
upperIdx,
lowerIdx, / upper and lower indexes of buterfly leg /
i, j, k; / loop control variables /

BLdiff=N;
TwFStep=1;
for(k=N;k>1;k=(k>>1)) / Do Log(base 2)(N) Stages /
{
BLStep=BLdiff;
BLdiff=BLdiff>>1;
TwFIndex=0;
for(j=0;j>1;
temp2R = (Y[upperIdx].real + Y[lowerIdx].real)>>1;
Y[upperIdx].real = (short) temp2R;
temp1I = (Y[upperIdx].imag - Y[lowerIdx].imag)>>1;
temp2I = (Y[upperIdx].imag + Y[lowerIdx].imag)>>1;
Y[upperIdx].imag = (short) temp2I;
temp2R = (ctemp1R - stemp1I)>>15;
Y[lowerIdx].real = (short) temp2R;
temp2I = (ctemp1I + stemp1R)>>15;
Y[lowerIdx

Solution

Answer for the resequencing part of your code

This function worked like magic for me:

void bit_reverse_reorder(complex *Y, int N)
{
   unsigned i,j;
   for (i = 0, j = 0; i < N; i++) {
   if (i < j) 
   {
     tempR=Y[j].real;
     tempI=Y[j].imag;
     Y[j].real=Y[i].real;
     Y[j].imag=Y[i].imag;
     Y[i].real=tempR;
     Y[i].imag=tempI;

   }
   unsigned bit = ~i & (i + 1);

   unsigned rev = (N / 2) / bit;

   j ^= (N - 1) & ~(rev - 1);
  }
}

About the compiler:

If I was you I'll use the GCC compiler preferably version 5.x with the highest optimization levels -O5 and try to use the -ffast-math it has its effect on arithmetic operations.

Code Snippets

void bit_reverse_reorder(complex *Y, int N)
{
   unsigned i,j;
   for (i = 0, j = 0; i < N; i++) {
   if (i < j) 
   {
     tempR=Y[j].real;
     tempI=Y[j].imag;
     Y[j].real=Y[i].real;
     Y[j].imag=Y[i].imag;
     Y[i].real=tempR;
     Y[i].imag=tempI;

   }
   unsigned bit = ~i & (i + 1);

   unsigned rev = (N / 2) / bit;

   j ^= (N - 1) & ~(rev - 1);
  }
}

Context

StackExchange Code Review Q#158449, answer score: 2

Revisions (0)

No revisions yet.