A fully offline, multi-speaker transcription pipeline for macOS (no cloud, no API keys, runs on M1/M2/M3 with Metal acceleration)

Posted by No_Weight6617@reddit | LocalLLaMA

Hey,

I developed VaultASR, a native C++ pipeline that runs the entire speech-to-text and speaker diarization stack locally. My main goal has been to use the hardware effectively and run end-to-end on the local machine, so that no sensitive recordings or data go to the cloud.

What it does:

Performance on M1:

Stack:

Roadmap: the goal is to support other execution providers, namely CUDA (NVIDIA), DirectML (Windows), and ROCm (AMD).

GitHub: https://github.com/vamshinr/vaultASR

I would love help extending this project to support other execution providers.