WhisperUI

Local audio transcription. No API key. No cloud. No data leaving your machine.

The Problem

You have a sensitive meeting recording. A client interview. A personal voice memo.

You need it transcribed.

The options:

Cloud services: Require API keys, cost money per minute, send your audio to someone else’s servers
Manual transcription: Takes hours, error-prone, tedious
Do nothing: The recording sits there, unsearchable, unusable

For sensitive content, cloud services are unacceptable. Your data should stay on your machine.

The Solution

WhisperUI is a local audio transcription tool that:

Runs OpenAI’s Whisper model entirely on your hardware
Requires no API key
Never sends your audio to the cloud
Works offline
Exports to text, SRT, or VTT formats

The Pipeline

WhisperUI uses a four-stage pipeline:

Stage 1: Audio Processing

FFmpeg handles audio format conversion
Supports MP3, WAV, M4A, and more
Normalizes audio quality for better transcription
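
As an illustration of this stage, here is a minimal sketch that assumes `ffmpeg` is available on the PATH. The function names, the 16 kHz mono WAV target, and the `loudnorm` filter are assumptions made for the example (one possible reading of "normalizes audio quality"), not WhisperUI's actual code.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts `src` to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", src,                # input file (MP3, WAV, M4A, ...)
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        "-af", "loudnorm",        # loudness normalization (assumed choice)
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]


def convert_audio(src: str, dst: str) -> Path:
    """Run ffmpeg and return the path of the converted file."""
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return Path(dst)
```

Keeping the command builder separate from the `subprocess` call makes the conversion settings easy to inspect without running FFmpeg.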

Stage 2: Transcription

OpenAI Whisper model runs locally
Multiple model sizes (tiny, base, small, medium, large)
Trade-off between speed and accuracy
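
A rough sketch of local transcription with the open-source `openai-whisper` Python package follows. The helper names are hypothetical, and how WhisperUI itself loads or bundles model weights is not described here. In that package, `load_model` fetches weights on first use and caches them locally.

```python
# Model sizes listed above, ordered from fastest to most accurate.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def validate_model_size(size: str) -> str:
    """Return `size` unchanged if it is a known model size, else raise ValueError."""
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size {size!r}; expected one of {MODEL_SIZES}")
    return size


def transcribe_locally(audio_path: str, size: str = "base") -> dict:
    """Transcribe `audio_path` on the local machine and return Whisper's result dict."""
    # Imported lazily so the module loads even when the package is absent.
    # Requires: pip install openai-whisper (and ffmpeg on the PATH).
    import whisper

    model = whisper.load_model(validate_model_size(size))
    # The result dict contains "text", "segments", and "language".
    return model.transcribe(audio_path)
```

Smaller sizes such as `tiny` and `base` run faster, while `medium` and `large` are generally more accurate but need more memory and time.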

Stage 3: Timestamping

Automatic timestamp generation
Word-level or segment-level precision
Configurable timestamp intervals
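
One way to read "configurable timestamp intervals" is regrouping word-level timestamps into segments of a chosen length. The sketch below is an assumption about how that could work, not WhisperUI's implementation. It expects word dicts with `word`, `start`, and `end` keys, the shape `openai-whisper` produces when `transcribe` is called with `word_timestamps=True`.

```python
def _merge(words: list[dict]) -> dict:
    """Collapse consecutive words into one segment."""
    return {
        "start": words[0]["start"],
        "end": words[-1]["end"],
        # Whisper word strings carry their own leading spaces.
        "text": "".join(w["word"] for w in words).strip(),
    }


def group_words_by_interval(words: list[dict], interval: float = 5.0) -> list[dict]:
    """Group words into segments spanning roughly `interval` seconds each."""
    segments: list[dict] = []
    current: list[dict] = []
    for word in words:
        # Close the current segment if adding this word would exceed the interval.
        if current and word["end"] - current[0]["start"] > interval:
            segments.append(_merge(current))
            current = []
        current.append(word)
    if current:
        segments.append(_merge(current))
    return segments
```

With `interval=5.0`, words spoken within the same five-second window end up in one segment, and a word that would push the window past five seconds starts the next one.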

Stage 4: Export

Plain text (.txt)
Subtitles (.srt, .vtt)
JSON with word-level timestamps
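
The subtitle formats differ mainly in their header and timestamp separator. Here is a minimal sketch of both exporters, assuming segments are dicts with `start`, `end` (in seconds), and `text` keys. The function names are illustrative rather than taken from WhisperUI.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SRT: numbered cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[dict]) -> str:
    """Render segments as WebVTT: a WEBVTT header, then cues with '.' separators."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], separator=".")
        end = format_timestamp(seg["end"], separator=".")
        lines += [f"{start} --> {end}", seg["text"].strip(), ""]
    return "\n".join(lines)
```

For JSON export, the same segment list can be passed to `json.dumps` from the standard library.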

The Privacy Guarantee

Your audio never leaves your machine.

No API key required
No cloud communication
No data sent to external servers
Works entirely offline

This is for anyone who has ever hesitated to upload a sensitive meeting recording to an online transcription service.

Platforms

Windows:

Standalone executable (.exe)
No installation required
Runs on any Windows machine

macOS:

Standalone application (.app)
Drag-and-drop installation
Native macOS experience

Use Cases

Business meetings: Transcribe strategy discussions without sending them to the cloud
Client interviews: Keep sensitive client data on your machine
Personal recordings: Voice memos, lectures, podcasts
Legal/medical: Transcribe recordings where privacy is required

Technical Details

Model: OpenAI Whisper, a state-of-the-art speech recognition model trained on 680,000 hours of multilingual audio.

Framework: PyQt6 for the desktop application interface. Cross-platform, native look and feel.

Audio Processing: FFmpeg for robust audio format handling and conversion.

Consulting Angle

WhisperUI demonstrates the value of local-first AI tooling. For organizations with privacy requirements, compliance obligations, or sensitive data, cloud-based AI services are often non-starters. Local models like Whisper enable AI capabilities without the privacy trade-off.

GitHub: WhisperUI

Built with Python, PyQt6, and OpenAI Whisper. Local audio transcription. No cloud. No API key.