gotcha · typescript · Major
Whisper API has 25MB file size limit — chunk long audio before transcribing
openai@4.x
whisper · speech-to-text · audio · chunking · ffmpeg · 25mb-limit
Problem
Sending audio files larger than 25MB to the OpenAI Whisper API returns a 413 error. Long recordings (meetings, lectures, podcasts) routinely exceed this limit.
Solution
Use ffmpeg to split audio into chunks under 25MB before sending. Split on silence boundaries when possible to avoid cutting words. Send chunks in order and concatenate the transcriptions. Include a few seconds of overlap between chunks to avoid missing words at split points.
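One way to plan the splits is to compute target cut points up front, snap each one to the nearest detected silence, and extend every chunk by the overlap. A minimal sketch of just the boundary math (the silence timestamps would come from something like ffmpeg's `silencedetect` filter; the function name and the 30-second snap window are illustrative choices, not part of any API):

```typescript
interface Chunk { start: number; end: number } // seconds

// Plan chunk boundaries: aim for maxLen-second chunks, snap each cut to the
// closest silence timestamp, and pad every chunk with `overlap` seconds.
function planChunks(
  duration: number,
  maxLen: number,
  overlap: number,
  silences: number[] = [],
): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < duration) {
    let cut = Math.min(start + maxLen, duration);
    // Snap to the closest silence within 30s of the target cut, if any.
    const near = silences.filter((s) => s > start && Math.abs(s - cut) < 30);
    if (near.length > 0 && cut < duration) {
      cut = near.reduce((a, b) => (Math.abs(b - cut) < Math.abs(a - cut) ? b : a));
    }
    chunks.push({ start, end: Math.min(cut + overlap, duration) });
    start = cut;
  }
  return chunks;
}
```

For a one-hour file split into ~10-minute chunks with 5 seconds of overlap, `planChunks(3600, 600, 5, [598, 1205])` cuts at the silences near the 600s and 1200s marks instead of mid-word. Translate each chunk's `start`/`end` into an `ffmpeg -ss/-to` invocation to produce the actual files.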
Why
The Whisper API enforces a hard 25MB limit per request. There is no streaming input mode — the full file must be uploaded before transcription begins.
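At a constant bitrate the 25MB cap translates directly into a maximum duration, which is useful for picking a chunk length. A quick back-of-the-envelope check (assuming constant bitrate; VBR files need an actual size check):

```typescript
// Longest audio duration (seconds) that fits under a byte limit at a given
// constant bitrate.
function maxDurationSeconds(limitBytes: number, bitrateKbps: number): number {
  const bytesPerSecond = (bitrateKbps * 1000) / 8;
  return limitBytes / bytesPerSecond;
}

const LIMIT = 25 * 1024 * 1024; // 25MB
console.log(maxDurationSeconds(LIMIT, 64) / 60); // ~55 minutes at 64kbps
```

So at the 64kbps mono encoding recommended below, a single request can hold roughly 54 minutes of audio; chunks of 10-15 minutes leave comfortable headroom.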
Gotchas
- Bitrate affects file size — convert to 16kHz mono MP3 at 64kbps to minimize size while maintaining accuracy
- Chunk overlap can introduce duplicate words at boundaries — post-process to deduplicate
- The 'language' parameter significantly improves accuracy for non-English audio
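The dedup step from the second gotcha can be done by finding the longest word sequence that ends one chunk's transcript and begins the next. A minimal sketch with pure string logic (no API calls; the function name and 20-word search window are illustrative):

```typescript
// Merge two consecutive chunk transcripts, dropping the duplicated words
// that the audio overlap produces at the boundary.
function mergeTranscripts(prev: string, next: string, maxOverlapWords = 20): string {
  const a = prev.trim().split(/\s+/);
  const b = next.trim().split(/\s+/);
  // Find the longest suffix of `a` that matches a prefix of `b`.
  for (let n = Math.min(maxOverlapWords, a.length, b.length); n > 0; n--) {
    const suffix = a.slice(-n).join(' ').toLowerCase();
    const prefix = b.slice(0, n).join(' ').toLowerCase();
    if (suffix === prefix) {
      return [...a, ...b.slice(n)].join(' ');
    }
  }
  return [...a, ...b].join(' '); // no overlap found; plain concatenation
}

mergeTranscripts('the quick brown fox', 'brown fox jumps over');
// → 'the quick brown fox jumps over'
```

Word-for-word matching is brittle when Whisper transcribes the overlap region slightly differently in each chunk; for production use, a fuzzy match (or aligning on `verbose_json` timestamps) is more robust.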
Code Snippets
Transcribe audio file with Whisper API
import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const transcription = await openai.audio.transcriptions.create({
  model: 'whisper-1',
  file: fs.createReadStream('/path/to/audio.mp3'),
  language: 'en', // improves accuracy when the language is known
  response_format: 'verbose_json', // includes segment-level timestamps
});
Context
Transcribing long-form audio content with OpenAI Whisper