Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
Speech Processing
2021
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings
Oktai Tatanov, Stanislav Beliaev, Boris Ginsburg
A Unified Transformer-based Framework for Duplex Text Normalization
Tuan Manh Lai, Yang Zhang, Evelina Bakhturina, Boris Ginsburg, Heng Ji
CarneliNet: Neural Mixture Model for Automatic Speech Recognition
Aleksei Kalinov, Somshubra Majumdar, Jagadeesh Balam, Boris Ginsburg
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction
Stanislav Beliaev, Boris Ginsburg
TalkNet: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis
Stanislav Beliaev, Boris Ginsburg
NeMo Inverse Text Normalization: From Development To Production
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg
A Toolbox for Construction and Analysis of Speech Datasets
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg
SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition
Patrick K. O’Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko
Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition
Somshubra Majumdar, Jagadeesh Balam, Oleksii Hrinchuk, Vitaly Lavrukhin, Vahid Noroozi, Boris Ginsburg
Hi-Fi Multi-Speaker English TTS Dataset
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang