HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMajor

Recording Audio Continuously in C

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
recordingaudiocontinuously

Problem

As an ongoing little side project of mine, I've been working on recording audio in c. You can the code's progression by looking at past versions (V1.0, V2.0).

I've found that my past versions were too limited. They required a certain time frame for which they would record, and then stop. This isn't very practical for many real-world applications, so I rewrote everything so the computer would always be listening, and only record necessary data.

Here is what I would like reviewed:

-
Memory consumption: Obviously a huge potential problem is lots of space being consumed in recording and saving the audio to file. Am I wasting space at all? Every little bit counts. Note: file storage container must be FLAC.

-
Speed: I have to be processing the data in real time. I can't hang for anything or I might lose precious audio data. Is there anywhere I could speed up processing?

-
Syntax/Styling: How does my code look? What about it could make it look better? Anything I'm doing wrong syntax-wise?

Feel free to review other stuff as well, I would just like reviews to be focused on the above. Keep in mind I've been away from the C language for a bit now, so I may have forgotten some of the more simple things ;)

main.c:

```
#include
#include
#include
#include
#include
#include
#include

#define FRAMES_PER_BUFFER 1024

typedef struct
{
uint16_t formatType;
uint8_t numberOfChannels;
uint32_t sampleRate;
size_t size;
float *recordedSamples;
} AudioData;

typedef struct
{
float *snippet;
size_t size;
} AudioSnippet;

AudioData initAudioData(uint32_t sampleRate, uint16_t channels, int type)
{
AudioData data;
data.formatType = type;
data.numberOfChannels = channels;
data.sampleRate = sampleRate;
data.size = 0;
data.recordedSamples = NULL;
return data;
}

float avg(float *data, size_t length)
{
float sum = 0;
for (size_t i = 0; i numberOfChannels,
.samplerate = data->sampleRate,
.fo

Solution

Indeed you must have been working on this project for a while, since I answered a question on SO related to one of your old versions a while back. At least some of the feedback I had is still pertinent even now.

Code and style comments

-

#include 
#include 
#include 
#include 
#include 
#include 
#include 


The problem starts already here. You make heavy use of uint32_t and other such types, without #include . I assume this works because you're probably still on Mac OS X, and this would also explain why your GCC doesn't complain about C99 loop initial declarations without -std=gnu99.

-
As mentioned previously, you make unnecessary use of pointer arithmetic, which makes the code hard to read. I definitely prefer @nhgrif's version of your average; I find sum += fabs(data[i]); eminently easier on the eye.

-
The avg() function has at least a few other issues. First, if you want to compute when someone is talking, I wouldn't do a straight average, even with a gimmick like fabs(). I'd do an RMS (Root-Mean-Square) instead and then subtract the regular average (without fabs()) from the RMS value; this computes the audio "power" in the buffer while ignoring any DC bias (non-zero average), which your code isn't robust against. This is directly connected to the number of decibels in your sample by the same power law used to "compand" the input from the microphone (It could be A-law or u-law, or maybe something else).

Second, and related; At 1024 frames and a sample rate of 44100Hz, your buffers cover a period of just 23 milliseconds; In other words, only one complete cycle of a sound with frequency 43 Hz. I don't know how low @rob0t's voice goes, but you may want to increase that amount if you will be dealing with low-frequency content. When computing the RMS and DC (average), you can only get an accurate result if you capture precisely an integer number of cycles for the frequencies of interest to you; But if your buffer spans enough cycles of all frequencies of interest to you, any inaccuracy introduced by partial cycles is drowned out by the complete ones in the average.

Thirdly, your buffer has 1024 frames, but the accumulator is float. I suspect you don't care about precision much here, but you may be interested in knowing that because IEEE floating-point arithmetic isn't exactly like real math, the order in which floating-point values are added may make a small difference. In particular, if you were to add up 1024 identical values, you'll loose about 10 out of 24 bits of precision on your last accumulations. This is because the accumulator will have grown to 2^10 times the size of the accumulands, and additions into the accumulator will only use the 14 MSBs while rounding off the 10 LSBs. I'd use double here unless it were proven to me that avg() is a performance bottleneck and that vectorization has already done all it can.

Fourthly and lastly,


return (float) sum / length;

is entirely redundant: sum is already a float, and you're casting it to float again. If you wanted to carry out this division using floating-point math, this is already guaranteed; In C, floating-point values outrank any integer in the implicit conversions. Remember, casts don't apply to the operation (/), but to their operand (here, sum).

-
storeFLAC() has no logging in case of errors, so I have no immediate explanations as to why your code, built on Linux, succeeds only in creating zero-length files. The error code it returns is completely ignored.

SNDFILE *outfile = sf_open(fileName, SFM_WRITE, &sfinfo);
if (!outfile) return -1;

// Write the entire buffer to the file


And due to lack of braces around early return statements, it's harder for reviewers to add their own debugging code, because it's that much more of a chore to add braces on top of a debug printf(). It's nice to be considerate to your reviewers, including your future self.

-
main() is in sore need of refactoring, specifically extraction of a few functions for each major step. The code you wrote does not "speak" to me; I must invest some effort to understand what it does. At least it is approximately a screenful, which is in my opinion a good size for a function.

-
This is gross misuse of goto:

int main(void)
{
  PaError err = paNoError;
  if((err = Pa_Initialize())) goto done;
  const PaDeviceInfo *info = Pa_GetDeviceInfo(Pa_GetDefaultInputDevice());
  AudioData data = initAudioData(44100, info->maxInputChannels, paFloat32);
  AudioSnippet sampleBlock =
  {
      .snippet = NULL,
      .size = FRAMES_PER_BUFFER * sizeof(float) * data.numberOfChannels
  };
  ...

  done:
  free(sampleBlock.snippet);
  Pa_Terminate();
  return err;
}


While I do approve of goto for error handling, you've here committed the sin of jumping over the initialization of local variables, then bothering to use them. Nothing guarantees that sampleBlock.snippet is initialized when you attempt to free() it: You h

Code Snippets

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include <time.h>
#include <portaudio.h>
#include <sndfile.h>
SNDFILE *outfile = sf_open(fileName, SFM_WRITE, &sfinfo);
if (!outfile) return -1;

// Write the entire buffer to the file
int main(void)
{
  PaError err = paNoError;
  if((err = Pa_Initialize())) goto done;
  const PaDeviceInfo *info = Pa_GetDeviceInfo(Pa_GetDefaultInputDevice());
  AudioData data = initAudioData(44100, info->maxInputChannels, paFloat32);
  AudioSnippet sampleBlock =
  {
      .snippet = NULL,
      .size = FRAMES_PER_BUFFER * sizeof(float) * data.numberOfChannels
  };
  ...

  done:
  free(sampleBlock.snippet);
  Pa_Terminate();
  return err;
}
portaudio.c: In function ‘main’:
portaudio.c:135:6: warning: ‘sampleBlock.snippet’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  free(sampleBlock.snippet);
      ^
if((err = Pa_OpenStream(&stream, &inputParameters, NULL, data.sampleRate, FRAMES_PER_BUFFER, paClipOff, NULL, NULL))) goto done;

Context

StackExchange Code Review Q#84536, answer score: 30

Revisions (0)

No revisions yet.