Scaling Video Training with Parallelism

We published a new blog post: Scaling Video Training with Parallelism.

Long-video training changes the unit of distributed computation: instead of distributing only whole samples across devices, the system must also partition a single long video sample across them. The post discusses sequence parallelism for long-video understanding and generation, including LongVILA MM-SP and LongLive-2.0 Balanced SP.
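
As a rough illustration of what splitting inside one sample means, the sketch below divides one long token sequence into contiguous shards, one per device, with shard sizes that differ by at most one token. It is a generic example and not the scheme used by LongVILA MM-SP or LongLive-2.0 Balanced SP; the function names `shard_bounds` and `shard_sequence` are hypothetical, and communication between devices (for example, for attention across shards) is left out.

```python
from typing import List, Sequence, Tuple


def shard_bounds(num_tokens: int, world_size: int) -> List[Tuple[int, int]]:
    """Return one [start, end) token range per rank.

    Shard sizes differ by at most one token; the first
    (num_tokens % world_size) ranks receive the extra token.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    base, extra = divmod(num_tokens, world_size)
    bounds = []
    start = 0
    for rank in range(world_size):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def shard_sequence(tokens: Sequence, world_size: int) -> List[Sequence]:
    """Slice a single long sequence into contiguous per-rank shards."""
    return [tokens[s:e] for s, e in shard_bounds(len(tokens), world_size)]


if __name__ == "__main__":
    # Hypothetical example: one video sample flattened to 10 tokens, 4 devices.
    video_tokens = list(range(10))
    for rank, shard in enumerate(shard_sequence(video_tokens, 4)):
        print(f"rank {rank}: {shard}")
```

Contiguous shards keep neighbouring frames on the same device, while the near-equal sizes keep the per-device token count even. Real systems may weigh other costs when balancing, which this sketch does not attempt to model.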

Yukang Chen
Senior Research Scientist

Senior Research Scientist at NVIDIA Research.

Song Han
Associate Professor

Song Han is an associate professor at MIT EECS.