Ryo Hachiuma is a Research Scientist at NVIDIA Research Taiwan, working on Multi-Modal AI. He received his Ph.D. from Keio University, advised by Prof. Hideo Saito. Before joining NVIDIA Research, he was a computer vision engineer at Konica Minolta, Inc. in Japan, working on human action recognition. His research interests are mainly in human activity analysis from multi-sensory data (e.g., audio-visual, audio-visual-language).