Unsupervised Video Interpolation Using Cycle Consistency
Published in International Conference on Computer Vision (ICCV), 2019
Recommended citation: Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro, "Unsupervised Video Interpolation Using Cycle Consistency". In ICCV 2019. https://arxiv.org/abs/1906.05928
Learning to synthesize high frame rate videos via interpolation requires large quantities of high frame rate training videos, which, however, are scarce, especially at high resolutions. Here, we propose unsupervised techniques to synthesize high frame rate videos directly from low frame rate videos using cycle consistency. For a triplet of consecutive frames, we optimize models to minimize the discrepancy between the center frame and its cycle reconstruction, obtained by interpolating back from interpolated intermediate frames. This simple unsupervised constraint alone achieves results comparable with supervision using the ground truth intermediate frames. We further introduce a pseudo supervised loss term that enforces the interpolated frames to be consistent with predictions of a pre-trained interpolation model. The pseudo supervised loss term, used together with cycle consistency, can effectively adapt a pre-trained model to a new target domain. With no additional data and in a completely unsupervised fashion, our techniques significantly improve pre-trained models on new target domains, increasing PSNR values from 32.84dB to 33.05dB on the Slowflow and from 31.82dB to 32.53dB on the Sintel evaluation datasets.
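The two loss terms described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. Frames are flattened to lists of floats, a per-pixel linear blend (`linear_blend`) stands in for a learned interpolation network, and the L1 distance, the function names, and the `cycle_weight` and `pseudo_weight` parameters are all assumptions made for the example.

```python
from typing import Callable, List

Frame = List[float]  # toy representation: a frame flattened to pixel intensities
Interpolator = Callable[[Frame, Frame, float], Frame]


def linear_blend(a: Frame, b: Frame, t: float) -> Frame:
    """Hypothetical stand-in for a learned model: per-pixel blend at time t."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def l1_distance(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames (assumed discrepancy measure)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(model: Interpolator,
                           f0: Frame, f1: Frame, f2: Frame) -> float:
    """Discrepancy between the center frame and its cycle reconstruction."""
    mid_01 = model(f0, f1, 0.5)          # interpolated frame between f0 and f1
    mid_12 = model(f1, f2, 0.5)          # interpolated frame between f1 and f2
    f1_rec = model(mid_01, mid_12, 0.5)  # interpolate back to the center time
    return l1_distance(f1_rec, f1)


def pseudo_supervised_loss(model: Interpolator, teacher: Interpolator,
                           f0: Frame, f1: Frame, f2: Frame) -> float:
    """Keep the model's interpolated frames close to a frozen pre-trained model's."""
    loss_01 = l1_distance(model(f0, f1, 0.5), teacher(f0, f1, 0.5))
    loss_12 = l1_distance(model(f1, f2, 0.5), teacher(f1, f2, 0.5))
    return 0.5 * (loss_01 + loss_12)


def total_loss(model: Interpolator, teacher: Interpolator,
               f0: Frame, f1: Frame, f2: Frame,
               cycle_weight: float = 1.0, pseudo_weight: float = 1.0) -> float:
    """Weighted sum of both terms; the weights here are illustrative only."""
    return (cycle_weight * cycle_consistency_loss(model, f0, f1, f2)
            + pseudo_weight * pseudo_supervised_loss(model, teacher, f0, f1, f2))
```

With this toy blend, a triplet whose intensities change linearly over time is reconstructed exactly, so the cycle term is zero, while a triplet such as `[0.0]`, `[1.0]`, `[0.0]` yields a nonzero cycle loss. In a real setting the model would be a trainable network and the loss would be minimized by gradient descent.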
Videos: Results and Comparisons
Upscaling framerate by 4x or 8x using our unsupervised techniques. We use the Super SloMo multi-frame interpolation model as our base network. Our techniques are, however, general and can be applied to any video interpolation model, for instance DVF, as described in our paper. An RGB flag indicates a synthesized frame.
Fully Unsupervised Training for 4x Framerate Upscaling
Synthesis of three intermediate frames for every pair of input frames on an example YouTube-8M video. Download.
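As a small worked example of the arithmetic behind an upscaling factor: 4x upscaling inserts three intermediate frames per input pair, and 8x inserts seven. The sketch below assumes the intermediate frames are evenly spaced in time between the two inputs. The function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from typing import List


def intermediate_times(factor: int) -> List[Fraction]:
    """Normalized times in (0, 1) of the frames synthesized between one input pair.

    Assumes evenly spaced intermediate frames for an integer upscaling factor.
    """
    if factor < 2:
        raise ValueError("upscaling factor must be at least 2")
    return [Fraction(k, factor) for k in range(1, factor)]


def output_frame_count(num_input_frames: int, factor: int) -> int:
    """Total frames after upscaling: the inputs plus the synthesized frames."""
    if num_input_frames < 2:
        return num_input_frames  # nothing to interpolate between
    return (num_input_frames - 1) * factor + 1


if __name__ == "__main__":
    print(intermediate_times(4))      # three times: 1/4, 1/2, 3/4
    print(output_frame_count(10, 4))  # 9 pairs * 4 + 1 = 37 frames
```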
Unsupervised Fine-tuning for 8x Framerate Upscaling
Comparison of supervised pre-training (baseline) with unsupervised fine-tuning (proposed) on Super SloMo for an example Slowflow video. Download.