
Announcements

New Standard for Speech Recognition and Translation from the NVIDIA NeMo Canary Model

Our team is thrilled to announce Canary, a multilingual model that sets a new standard in speech-to-text recognition and translation. Read more about it in our team's post on the NVIDIA Techblog.


Introducing NeMo Forced Aligner

Today we introduce NeMo Forced Aligner (NFA), a NeMo-based tool for forced alignment.

NFA allows you to obtain token-level, word-level and segment-level timestamps for words spoken in an audio file. NFA produces timestamp information in a variety of output file formats, including subtitle files, which you can use to create videos such as the one below [1]: