NVIDIA Research Taiwan
Object and Action Hallucinations
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
Recent advances in multimodal LLMs (MLLMs) have demonstrated a remarkable capability to generate descriptive captions for input videos. However, these models suffer from factual inaccuracies in the generated descriptions, causing severe …