A Chat about Boring Problems: Studying GPT-Based Text Normalization

Text normalization - the conversion of text from written to spoken form - is traditionally assumed to be an ill-formed task for language modeling. In this work, we argue otherwise. We empirically show the capacity of large language models (LLMs) for text normalization (TN) in few-shot scenarios. Combining self-consistency reasoning with linguistically-informed prompt engineering, we find LLM-based text normalization to achieve error rates approximately 40% lower than those of production-level normalization systems. Further, through error analysis, we note key limitations in the conventional design of text normalization tasks. We create a new taxonomy of text normalization errors and apply it to results from GPT-3.5-Turbo and GPT-4. Through this new framework, we identify strengths and weaknesses of LLM-based TN, opening opportunities for future work.
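The self-consistency step mentioned above can be illustrated with a minimal majority-vote sketch. This is not the paper's implementation; `generate` is a hypothetical stand-in for an LLM call that samples candidate normalizations with non-zero temperature, and the vote simply keeps the most frequent output.

```python
from collections import Counter

def self_consistent_normalize(generate, written_text, n_samples=5):
    """Majority-vote self-consistency: sample several candidate
    spoken-form normalizations of `written_text` and return the
    one produced most often. `generate` is a hypothetical stand-in
    for a sampled LLM call (not an API from the paper)."""
    candidates = [generate(written_text) for _ in range(n_samples)]
    most_common, _count = Counter(candidates).most_common(1)[0]
    return most_common

# Usage with a stand-in generator that mimics sampling variation:
_samples = iter([
    "three hundred twenty nine dollars",
    "three hundred twenty nine dollars",
    "three twenty nine dollars",
])
result = self_consistent_normalize(lambda text: next(_samples), "$329", n_samples=3)
# result: "three hundred twenty nine dollars"
```

In practice the vote is taken over many sampled completions, so occasional mis-verbalizations are outvoted by the model's dominant reading.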

Authors

Yang Zhang (NVIDIA)
Travis M. Bartley (City University of New York)
Mariana Graterol-Fuenmayor (NVIDIA)
Vitaly Lavrukhin (NVIDIA)
Evelina Bakhturina (NVIDIA)
Boris Ginsburg (NVIDIA)

Publication Date