Efficient AI
Long Video Training
Scaling Video Training with Parallelism
Long-video training changes the unit of distributed computation. This blog explains how sequence parallelism scales training when one video sample is too long for one GPU, comparing LongVILA MM-SP and LongLive-2.0 Balanced SP.
Yukang Chen, Luozhou Wang, Wei Huang, Shuai Yang, Weian Mao, Song Han
Jun 3, 2026