Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations

Large audio-language models (LALMs) extend text-based LLMs with auditory understanding, offering new opportunities for multimodal applications. While their perception, reasoning, and task performance have been widely studied, their safety alignment …