Our method is robust to the choice of video diffusion guidance model, and it also generates realistic motion with ModelScope guidance.
A baby panda eating ice cream.
A dog riding a skateboard.
A squirrel riding a motorcycle.
A cat singing.
Clown fish swimming through the coral reef.
Superhero dog with red cape flying through the sky.
A monkey eating a candy bar.
An emoji of a baby panda reading a book.