WhisperUI
Local audio transcription. No API key. No cloud. No data leaving your machine.
The Problem
You have a sensitive meeting recording. A client interview. A personal voice memo.
You need it transcribed.
The options:
- Cloud services: Require API keys, cost money per minute, and send your audio to someone else’s servers
- Manual transcription: Takes hours, error-prone, tedious
- Do nothing: The recording sits there, unsearchable, unusable
For sensitive content, cloud services are unacceptable. Your data should stay on your machine.
The Solution
WhisperUI is a local audio transcription tool that:
- Runs OpenAI’s Whisper model entirely on your hardware
- Requires no API key
- Never sends your audio to the cloud
- Works offline
- Exports to text, SRT, or VTT formats
The Pipeline
WhisperUI uses a four-stage pipeline:
Stage 1: Audio Processing
- FFmpeg handles audio format conversion
- Supports MP3, WAV, M4A, and more
- Normalizes audio quality for better transcription
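As a rough sketch of what the conversion step could look like, the snippet below builds an FFmpeg command that turns any supported input into 16 kHz mono WAV, the format Whisper resamples audio to internally. The function names `build_ffmpeg_command` and `convert_to_wav` and the choice of flags are assumptions for illustration; WhisperUI's actual FFmpeg invocation may differ.

```python
from __future__ import annotations

import subprocess
from pathlib import Path
from typing import Optional


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Return an FFmpeg argument list that converts `src` to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input file (MP3, WAV, M4A, ...)
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # encode as 16-bit little-endian PCM
        dst,
    ]


def convert_to_wav(src: str, dst: Optional[str] = None) -> str:
    """Run FFmpeg (assumed to be on PATH) and return the output path."""
    source = Path(src)
    # Hypothetical naming scheme: write next to the input with a "_16k" suffix.
    out = dst or str(source.with_name(source.stem + "_16k.wav"))
    subprocess.run(build_ffmpeg_command(src, out), check=True, capture_output=True)
    return out
```

Keeping the command builder separate from the subprocess call makes the flags easy to inspect or log without running FFmpeg.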
Stage 2: Transcription
- OpenAI Whisper model runs locally
- Multiple model sizes (tiny, base, small, medium, large)
- Trade-off between speed and accuracy: smaller models run faster, larger models transcribe more accurately
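For illustration, local transcription with the open-source `openai-whisper` Python package can be as short as the sketch below. The wrapper `transcribe_locally` and the `MODEL_SIZES` tuple are hypothetical names, not WhisperUI's own code; `whisper.load_model` and `model.transcribe` are the package's documented entry points.

```python
from __future__ import annotations

# Model sizes listed above, from fastest to most accurate.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def transcribe_locally(audio_path: str, model_size: str = "base") -> dict:
    """Transcribe `audio_path` on the local machine and return Whisper's result dict."""
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size {model_size!r}; expected one of {MODEL_SIZES}")

    # Imported lazily so the size check works even where the package is absent.
    import whisper  # provided by the openai-whisper package

    # Weights are read from the local cache; the library fetches them once
    # if they are not on disk yet.
    model = whisper.load_model(model_size)

    # The result includes "text" plus "segments" with "start" and "end" times.
    return model.transcribe(audio_path)
```

A call such as `transcribe_locally("meeting_16k.wav", "small")["text"]` would then yield the full transcript as a single string.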
Stage 3: Timestamping
- Automatic timestamp generation
- Word-level or segment-level precision
- Configurable timestamp intervals
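One plausible reading of "configurable timestamp intervals" is regrouping word-level timestamps into chunks of a chosen maximum duration. The sketch below shows that idea under stated assumptions: the `Word` tuple and the `group_words` helper are hypothetical, and WhisperUI may implement its intervals differently.

```python
from __future__ import annotations

from typing import NamedTuple


class Word(NamedTuple):
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


def _merge(words: list[Word]) -> Word:
    """Collapse consecutive words into one timestamped chunk."""
    text = " ".join(w.text.strip() for w in words)
    return Word(words[0].start, words[-1].end, text)


def group_words(words: list[Word], interval: float = 5.0) -> list[Word]:
    """Group words into chunks spanning at most `interval` seconds.

    A single word longer than `interval` is kept as its own chunk.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")

    chunks: list[Word] = []
    current: list[Word] = []
    for word in words:
        # Start a new chunk when adding this word would exceed the interval.
        if current and word.end - current[0].start > interval:
            chunks.append(_merge(current))
            current = []
        current.append(word)
    if current:
        chunks.append(_merge(current))
    return chunks
```

With `interval` set very large, every word lands in one chunk; with it set very small, each word keeps its own timestamp, which approximates word-level output.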
Stage 4: Export
- Plain text (.txt)
- Subtitles (.srt, .vtt)
- JSON with word-level timestamps
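The subtitle formats differ mainly in their timestamp separator and header: SRT uses numbered cues with `HH:MM:SS,mmm`, while WebVTT starts with a `WEBVTT` line and uses `HH:MM:SS.mmm`. The following is a minimal sketch of both exporters, assuming segments are dictionaries with `start`, `end`, and `text` keys (the shape Whisper returns); the function names are illustrative, not WhisperUI's own.

```python
from __future__ import annotations


def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for WebVTT)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SRT: numbered cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[dict]) -> str:
    """Render segments as WebVTT: a header line followed by unnumbered cues."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        lines += [f"{start} --> {end}", seg["text"].strip(), ""]
    return "\n".join(lines)
```

Plain-text export is then just the segment texts joined together, and JSON export can serialize the same segment dictionaries with `json.dumps`.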
The Privacy Guarantee
Your audio never leaves your machine.
- No API key required
- No cloud communication
- No data sent to external servers
- Works entirely offline
This is for anyone who’s ever hesitated to upload a sensitive meeting recording to an online service.
Platforms
Windows:
- Standalone executable (.exe)
- No installation required
- Runs on any Windows machine
macOS:
- Standalone application (.app)
- Drag-and-drop installation
- Native macOS experience
Use Cases
- Business meetings: Transcribe strategy discussions without sending audio to the cloud
- Client interviews: Keep sensitive client data on your machine
- Personal recordings: Voice memos, lectures, podcasts
- Legal/medical: Transcribe recordings where privacy is required
Technical Details
Model: OpenAI Whisper, a speech recognition model trained on 680,000 hours of multilingual and multitask supervised audio.
Framework: PyQt6 for the desktop application interface. Cross-platform, native look and feel.
Audio Processing: FFmpeg for robust audio format handling and conversion.
Consulting Angle
WhisperUI demonstrates the value of local-first AI tooling. For organizations with privacy requirements, compliance obligations, or sensitive data, cloud-based AI services are often non-starters. Local models like Whisper enable AI capabilities without the privacy trade-off.
GitHub: WhisperUI
Built with Python, PyQt6, and OpenAI Whisper. Local audio transcription. No cloud. No API key.