 
 # VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge


 ## Authors


[Vishwesh Nath](/person/vishwesh-nath)

[Wenqi Li](/person/wenqi-li)

[Dong Yang](/person/dong-yang)

[Andriy Myronenko](/person/andriy-myronenko)

Mingxin Zheng (NVIDIA)

Yao Lu (NVIDIA)

[Zhijian Liu](/person/zhijian-liu)

[Hongxu Danny Yin](/person/danny-yin)

Yee Man Law (SingHealth)

Stephanie Harmon (NIH)

Benjamin Simon (NIH)

[Greg Heinrich](/person/greg-heinrich)

Stephen Aylward (NVIDIA)

Marc Edgar (NVIDIA)

Michael Zephyr (NVIDIA)

[Pavlo Molchanov](/person/pavlo-molchanov)

Baris Turkbey (NIH)

[Holger Roth](/person/holger-roth)

[Daguang Xu](/person/daguang-xu)

 
 ## Publication Date


Wednesday, June 11, 2025

 
 ## Published in


[CVPR 2025](https://openaccess.thecvf.com/content/CVPR2025/html/Nath_VILA-M3_Enhancing_Vision-Language_Models_with_Medical_Expert_Knowledge_CVPR_2025_paper.html)

 
 ## Research Area


[Medical](/research-area/medical)