pattern · python · Moderate

CLI voice-to-text with Whisper and sounddevice on macOS

Submitted by: @anonymous
Tags: voice recording, transcription, microphone, dictation, openai-whisper, terminal

Problem

Need a terminal tool for speech-to-text transcription on macOS without cloud APIs or API keys. Common use cases: dictating notes, transcribing meetings, voice input for CLI workflows. The browser-based Web Speech API requires Chrome and cannot be used from the terminal.

Solution

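Use Python with sounddevice for microphone recording and openai-whisper for local transcription.

Install: pip install openai-whisper sounddevice numpy. Transcribing from a file path also requires the ffmpeg binary on PATH (for example, brew install ffmpeg); passing a numpy array does not.

Record with sd.InputStream (16 kHz, mono, float32), collect chunks in a callback, and stop on user input. Use threading.Event or a blocking input() call to control start/stop. Concatenate the chunks and flatten them to a 1-D float32 array: the callback delivers blocks of shape (frames, channels), while Whisper expects a 1-D waveform sampled at 16 kHz. Pass that array directly to whisper.load_model('tiny').transcribe(audio, fp16=False) and read the transcript from the 'text' key of the returned dict.

The tiny model (39M parameters, roughly 75 MB download) is fast enough for short dictation with little delay; base (74M parameters, roughly 140 MB download) gives better accuracy. With this approach, transcription runs after recording stops rather than as a live stream.

For clipboard integration on macOS, pipe the text to pbcopy via subprocess. For file transcription, pass the file path directly to model.transcribe(). Key options: the language parameter (for example language='de') for non-English speech, and fp16=False on CPU-only machines, which avoids the warning Whisper emits when it falls back from FP16 to FP32.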
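
The steps above can be sketched roughly as follows. This is a minimal illustration rather than a finished tool: the function names (chunks_to_audio, record_until_enter, transcribe, copy_to_clipboard, main) are hypothetical, it assumes the default input device is the microphone you want, and sounddevice and whisper are imported lazily so the pure helpers can be used without audio hardware.

```python
import shutil
import subprocess

import numpy as np

SAMPLE_RATE = 16_000  # Whisper expects 16 kHz mono audio


def chunks_to_audio(chunks):
    """Join callback blocks of shape (frames, 1) into one 1-D float32 array."""
    if not chunks:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(chunks, axis=0).reshape(-1).astype(np.float32)


def record_until_enter():
    """Record from the default input device until the user presses Enter."""
    import sounddevice as sd  # lazy import: needs PortAudio and a microphone

    chunks = []

    def callback(indata, frames, time, status):
        # Called on the audio thread for every block of samples.
        if status:
            print(status)
        # Copy the block, because sounddevice reuses the underlying buffer.
        chunks.append(indata.copy())

    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                        dtype="float32", callback=callback):
        input("Recording... press Enter to stop. ")
    return chunks_to_audio(chunks)


def transcribe(audio, model_name="tiny", language=None):
    """Transcribe a 1-D float32 waveform (or a file path) with a local model."""
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_name)
    # fp16=False keeps inference in FP32, which is what CPU-only machines use.
    result = model.transcribe(audio, fp16=False, language=language)
    return result["text"].strip()


def copy_to_clipboard(text, command=("pbcopy",)):
    """Pipe text to a clipboard command; return False if it is not installed."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(list(command), input=text.encode("utf-8"), check=True)
    return True


def main():
    """Record, transcribe, print the text, and copy it to the clipboard."""
    audio = record_until_enter()
    text = transcribe(audio)
    print(text)
    if copy_to_clipboard(text):
        print("(copied to clipboard)")
```

Calling main() from a script records until Enter is pressed, prints the transcript, and copies it with pbcopy when that command is available. Passing model_name="base" or a language code to transcribe() trades speed for accuracy or skips language auto-detection.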

Why

Browser-based speech recognition (Web Speech API) requires Chrome and a GUI. For terminal workflows, local Whisper models provide offline transcription with no API dependency. sounddevice wraps PortAudio and provides low-latency mic access with a simple callback API.
