Speech Transcription with parakeet-tdt-0.6b-v3 🦜

This demo showcases parakeet-tdt-0.6b-v3, a 600-million-parameter multilingual model designed for high-quality speech recognition.

Key Features:

  • Multilingual transcription across 25 European languages
  • Automatic punctuation and capitalization
  • Accurate word-level timestamps (click on a segment in the table below to play it!)
  • Long audio transcription: up to 24 minutes with full attention (on an A100 80GB GPU), or up to 3 hours with local attention
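
For readers who want to reproduce these features outside the demo, here is a minimal sketch using the NeMo toolkit. It assumes that NeMo is installed, that the checkpoint is published under the name `nvidia/parakeet-tdt-0.6b-v3`, and that segment timestamps come back as dictionaries with `start`, `end`, and `segment` keys. The helper names (`load_model`, `segments_to_rows`, `transcribe`) and the local-attention context size of `[256, 256]` are illustrative choices, not part of this demo's code.

```python
from typing import Any, Dict, List, Tuple


def load_model(long_audio: bool = False) -> Any:
    """Load the checkpoint through NeMo (imported lazily, so NeMo is only needed here)."""
    import nemo.collections.asr as nemo_asr  # requires the NeMo toolkit

    # Assumption: the checkpoint name below matches the published model.
    model = nemo_asr.models.ASRModel.from_pretrained(
        model_name="nvidia/parakeet-tdt-0.6b-v3"
    )
    if long_audio:
        # Assumption: switch the encoder to local attention for multi-hour audio;
        # the context size is an illustrative value, not a tuned one.
        model.change_attention_model(
            self_attention_model="rel_pos_local_attn",
            att_context_size=[256, 256],
        )
    return model


def segments_to_rows(segments: List[Dict[str, Any]]) -> List[List[Any]]:
    """Convert segment timestamp dicts into [start, end, text] rows for a table."""
    return [
        [round(float(s["start"]), 2), round(float(s["end"]), 2), s["segment"]]
        for s in segments
    ]


def transcribe(model: Any, audio_path: str) -> Tuple[str, List[List[Any]]]:
    """Return the full transcript plus one table row per timestamped segment."""
    output = model.transcribe([audio_path], timestamps=True)
    rows = segments_to_rows(output[0].timestamp["segment"])
    return output[0].text, rows
```

A typical call would be `text, rows = transcribe(load_model(), "sample.wav")`, where `sample.wav` is a placeholder path. Keeping `segments_to_rows` separate from the model call means the table formatting can be checked without downloading the model.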

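Supported Languages: Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), Ukrainian (uk)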

This model is available for commercial and non-commercial use under the CC BY 4.0 license.

🎙️ Learn more about the Model | 📄 Fast Conformer paper | 📚 TDT paper | 🧑‍💻 NeMo Repository

Transcription Results (Click row to play segment)
