Single-photon light detection and ranging (LiDAR) techniques use emerging single-photon avalanche diode (SPAD) detectors to push 3D imaging capabilities to unprecedented ranges. However, it remains challenging to robustly estimate scene depth from the noisy and otherwise corrupted measurements recorded by a SPAD. Here, we propose a deep sensor fusion strategy that combines corrupted SPAD data and a conventional 2D image to estimate the depth of a scene. Our primary contribution is a neural network architecture, SPADnet, that uses a monocular depth estimation algorithm together with a SPAD denoising and sensor fusion strategy. This architecture, together with several techniques in network training, achieves state-of-the-art results for RGB-SPAD fusion on both simulated and captured data. Moreover, SPADnet is more computationally efficient than previous RGB-SPAD fusion networks.
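The sketch below illustrates the general kind of RGB-SPAD fusion the abstract describes: a SPAD photon-count histogram volume is denoised with 3D convolutions, a monocular depth estimate derived from the RGB image is rasterized into the same time-bin space, and the fused volume is converted to depth with a soft argmax. It is a minimal, hedged example in PyTorch; the module names, layer sizes, and fusion scheme are assumptions chosen for readability and do not reproduce the authors' SPADnet implementation.

```python
# Illustrative sketch (not the authors' code): a toy RGB-SPAD fusion module.
# All shapes and hyperparameters are assumptions for demonstration only.
import torch
import torch.nn as nn


class ToyRGBSPADFusion(nn.Module):
    def __init__(self, num_bins: int = 64):
        super().__init__()
        self.num_bins = num_bins
        # 3D convolutions denoise the SPAD photon-count histogram volume.
        self.denoise = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
        )
        # The monocular depth estimate is fused with the denoised histogram.
        self.fuse = nn.Sequential(
            nn.Conv3d(9, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 1, kernel_size=3, padding=1),
        )

    def forward(self, spad_hist, mono_depth):
        # spad_hist:  (B, 1, T, H, W) noisy photon-count histogram
        # mono_depth: (B, 1, H, W) depth in [0, 1] from a monocular estimator
        feat = self.denoise(spad_hist)
        # Rasterize the monocular depth into a soft volume over time bins.
        bins = torch.linspace(0, 1, self.num_bins, device=mono_depth.device)
        bins = bins.view(1, 1, -1, 1, 1)
        mono_vol = torch.exp(-((mono_depth.unsqueeze(2) - bins) ** 2) / 0.01)
        logits = self.fuse(torch.cat([feat, mono_vol], dim=1))
        # Soft argmax over time bins yields a differentiable depth estimate.
        prob = torch.softmax(logits.squeeze(1), dim=1)
        depth = (prob * bins.squeeze(1)).sum(dim=1, keepdim=True)
        return depth


if __name__ == "__main__":
    net = ToyRGBSPADFusion(num_bins=64)
    hist = torch.rand(2, 1, 64, 32, 32)   # simulated noisy SPAD histograms
    mono = torch.rand(2, 1, 32, 32)       # simulated monocular depth maps
    print(net(hist, mono).shape)          # -> torch.Size([2, 1, 32, 32])
```

The soft-argmax readout is one common way to keep the depth output differentiable with respect to the per-bin probabilities; it stands in here for whatever depth-regression head the actual network uses.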
Original language: English (US)
State: Published - Apr 20 2020
Bibliographical note: KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: D.L. was supported by a Stanford Graduate Fellowship. G.W. was supported by an NSF CAREER Award (IIS 1553333), a Sloan Fellowship, the KAUST Office of Sponsored Research through the Visual Computing Center CCF grant, the DARPA REVEAL program, and a PECASE by the U.S. Army Research Office. The authors would like to thank Matthew O'Toole for his work on the prototype and data acquisition.