Easy Transcription

Stable service · 2026

Self-hosted speech-to-structured-notes — browser recording, live WebSocket ASR, and LLM post-structuring into summaries and action items.

faster-whisper large-v3 (CUDA) · live streaming, ~8s chunks · 3 capture modes · PostgreSQL job pipeline

Open a browser tab, talk, get structured notes. Easy Transcription wraps GPU-accelerated faster-whisper (large-v3) in a clean web app with three capture modes — file upload, in-browser recording, and live WebSocket streaming that transcribes as you speak — backed by a PostgreSQL job pipeline with Alembic-versioned schema.
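As a rough illustration of the live mode, the sketch below shows one way incoming audio frames could be grouped into ~8-second chunks before each chunk is handed to the recognizer. It is not the project's actual code: the `ChunkBuffer` class, the 16 kHz mono 16-bit PCM format, and the exact 8-second boundary are assumptions made for the example.

```python
from dataclasses import dataclass, field

SAMPLE_RATE = 16_000   # assumed: 16 kHz mono audio
BYTES_PER_SAMPLE = 2   # assumed: 16-bit PCM
CHUNK_SECONDS = 8      # taken from the "~8s chunks" figure above


@dataclass
class ChunkBuffer:
    """Hypothetical helper: accumulates raw PCM frames into fixed-length chunks."""

    chunk_bytes: int = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_SECONDS
    _pending: bytearray = field(default_factory=bytearray)

    def feed(self, frame: bytes) -> list[bytes]:
        """Add one incoming frame and return any chunks that are now complete."""
        self._pending.extend(frame)
        chunks = []
        while len(self._pending) >= self.chunk_bytes:
            chunks.append(bytes(self._pending[: self.chunk_bytes]))
            del self._pending[: self.chunk_bytes]
        return chunks

    def flush(self) -> bytes:
        """Return the trailing partial chunk when the stream ends."""
        tail = bytes(self._pending)
        self._pending.clear()
        return tail
```

In a WebSocket handler, each received binary message would go through `feed`, every returned chunk would be transcribed and its text sent back to the browser, and `flush` would cover the last few seconds when the client disconnects.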

An optional LLM pass turns raw transcripts into paragraphed text, summaries, and action items. Raw text is always the default, and structuring is opt-in per job. GPU access goes through the platform's timeshare lease API, so a long transcription can be paused and resumed when higher-priority work claims the card. That makes the service a real test of cooperative preemption, and it passes.
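The pause-and-resume behaviour can be pictured as a loop that checks the GPU lease between units of work and records how far it got. The sketch below is only an illustration of cooperative preemption in general: the platform's real lease API is not shown in this write-up, so `lease_is_held`, `JobState`, and `run_until_preempted` are hypothetical names, and the per-chunk granularity is an assumption.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    """Progress that would be persisted between runs (for example, in the job table)."""

    next_chunk: int = 0
    text: str = ""


def run_until_preempted(chunks, state, lease_is_held, transcribe):
    """Process chunks in order, checking the lease before each one.

    Returns True when the job has finished and False when it yielded the GPU.
    `lease_is_held` and `transcribe` are injected callables (hypothetical here).
    """
    while state.next_chunk < len(chunks):
        if not lease_is_held():
            return False  # paused, not stopped: state records where to resume
        state.text += transcribe(chunks[state.next_chunk])
        state.next_chunk += 1
    return True
```

Because progress lives in `state` rather than in the loop, calling the function again after the lease is re-acquired continues from the first unprocessed chunk instead of starting over.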

It has quietly become infrastructure: the in-car voice assistant treats it as its ASR dependency, which is exactly the "paused ≠ stopped" service-maturity model the platform is built around.

Status & limits — stable daily-driver; the roadmap item is a telemetry layer (structuring usage rates, output quality scoring) rather than features.

Stack

Python · FastAPI · faster-whisper · WebSockets · PostgreSQL · Web Audio API
