SST: Single-Stream Temporal Action Proposals

Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, Juan Carlos Niebles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

355 Scopus citations

Abstract

Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short overlapping clips or temporal windows for batch processing. We demonstrate empirically that our model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature. Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.
Original language: English (US)
Title of host publication: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 6373-6382
Number of pages: 10
ISBN (Print): 9781538604571
State: Published - Nov 9 2017

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This research was sponsored, in part, by the Stanford AI Lab-Toyota Center for Artificial Intelligence Research, Toyota Research Institute (TRI), and by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research. This article reflects the opinions and conclusions of its authors and not TRI or any other Toyota entity. We thank our anonymous reviewers, De-An Huang, Oliver Groth, Fabian Caba, Joseph Lim, Jingwei Ji, and Fei-Fei Li for helpful comments and discussion.
