DEEPV2D: VIDEO TO DEPTH WITH DIFFERENTIABLE STRUCTURE FROM MOTION

Zachary Teed, Jia Deng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

47 Scopus citations

Abstract

We propose DeepV2D, an end-to-end deep learning architecture for predicting depth from video. DeepV2D combines the representation ability of neural networks with the geometric principles governing image formation. We compose a collection of classical geometric algorithms, which are converted into trainable modules and combined into an end-to-end differentiable architecture. DeepV2D interleaves two stages: motion estimation and depth estimation. During inference, motion and depth estimation are alternated and converge to accurate depth.
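
The alternation described in the abstract can be illustrated with a small classical analogue. The sketch below is not the DeepV2D architecture and uses no learned modules. It assumes a deliberately simplified, hypothetical model in which a sideways-translating camera sees point j shift by motion[i] * inv_depth[j] pixels in frame i, and it alternates two least-squares updates, one for motion and one for inverse depth. All function and variable names are illustrative assumptions.

```python
# Toy illustration only: classical alternating least squares, NOT the DeepV2D networks.
# Assumed model (hypothetical): shift[i][j] = motion[i] * inv_depth[j],
# i.e. the pixel shift of point j in frame i under pure sideways translation.

def estimate_motion(shift, inv_depth):
    """Least-squares motion for each frame, holding inverse depths fixed."""
    denom = sum(w * w for w in inv_depth)
    return [sum(s * w for s, w in zip(row, inv_depth)) / denom for row in shift]

def estimate_inv_depth(shift, motion):
    """Least-squares inverse depth for each point, holding motions fixed."""
    denom = sum(t * t for t in motion)
    n_points = len(shift[0])
    return [
        sum(shift[i][j] * motion[i] for i in range(len(motion))) / denom
        for j in range(n_points)
    ]

def alternate(shift, n_iters=5):
    """Alternate motion and depth updates, starting from a flat depth guess."""
    inv_depth = [1.0] * len(shift[0])
    motion = []
    for _ in range(n_iters):
        motion = estimate_motion(shift, inv_depth)
        inv_depth = estimate_inv_depth(shift, motion)
    return motion, inv_depth

# Synthetic, noise-free observations built from made-up ground truth.
true_motion = [0.5, 1.0, 1.5]
true_inv_depth = [0.2, 0.5, 1.0, 0.25]
shift = [[t * w for w in true_inv_depth] for t in true_motion]

motion, inv_depth = alternate(shift)
# Motion and depth are recovered only up to a shared scale factor,
# so compare ratios (or the reconstructed shifts) rather than raw values.
depth_ratio = inv_depth[0] / inv_depth[2]
```

In this toy setting the estimates reproduce the observed shifts, while absolute scale stays ambiguous, as is usual when motion and depth are both estimated from a single moving camera.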
Original language: English (US)
Title of host publication: 8th International Conference on Learning Representations, ICLR 2020
Publisher: International Conference on Learning Representations, ICLR
State: Published - Jan 1 2020
Externally published: Yes

Bibliographical note

KAUST Repository Item: Exported on 2023-04-05
Acknowledged KAUST grant number(s): OSR-2015-CRG4-2639
Acknowledgements: We would like to thank Zhaoheng Zheng for helping with baseline experiments. This work was partially funded by the Toyota Research Institute, the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. OSR-2015-CRG4-2639, and the National Science Foundation under Grant No. 1617767.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
