Abstract
Videos show continuous events, yet most - if not all - video synthesis frameworks treat them discretely in time. In this work, we think of videos of what they should be - time-continuous signals, and extend the paradigm of neural representations to build a continuous-time video generator. For this, we first design continuous motion representations through the lens of positional embeddings. Then, we explore the question of training on very sparse videos and demon-strate that a good generator can be learned by using as few as 2 frames per clip. After that, we rethink the traditional image + video discriminators pair and design a holistic dis-criminator that aggregates temporal information by simply concatenating frames' features. This decreases the training cost and provides richer learning signal to the generator, making it possible to train directly on 10242 videos for the first time. We build our model on top of StyleGAN2 and it is just ≈5% more expensive to train at the same resolution while achieving almost the same image quality. Moreover, our latent space features similar properties, enabling spa-tial manipulations that our method can propagate in time. We can generate arbitrarily long videos at arbitrary high frame rate, while prior work struggles to generate even 64 frames at a fixed rate. Our model is tested on four mod-ern 2562 and one 10242 -resolution video synthesis bench-marks. In terms of sheer metrics, it performs on average ≈30% better than the closest runner-up. Project website: https://universome.github.io/stylegan-v.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 |
Publisher | IEEE Computer Society |
Pages | 3616-3626 |
Number of pages | 11 |
ISBN (Electronic) | 9781665469463 |
DOIs | |
State | Published - 2022 |
Event | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States Duration: Jun 19 2022 → Jun 24 2022 |
Publication series
Name | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
Volume | 2022-June |
ISSN (Print) | 1063-6919 |
Conference
Conference | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 |
---|---|
Country/Territory | United States |
City | New Orleans |
Period | 06/19/22 → 06/24/22 |
Bibliographical note
Publisher Copyright:© 2022 IEEE.
Keywords
- Image and video synthesis and generation
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition