A Coupled Hidden Conditional Random Field Model for Simultaneous Face Clustering and Naming in Videos

Yifan Zhang, Zhiqiang Tang, Baoyuan Wu, Qiang Ji, Hanqing Lu

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

For face naming in TV series or movies, a typical approach is to use subtitle/script alignment to obtain the time stamps of the names and then tag the faces with them. We study the problem of face naming in videos when subtitles are not available. To this end, we divide the problem into two tasks: face clustering, which groups the faces depicting a certain person into a cluster, and name assignment, which associates a name with each face. Each task is formulated as a structured prediction problem and modeled by a hidden conditional random field (HCRF). We argue that the two tasks are correlated, so that the output of each provides prior knowledge for the other's target prediction. The two HCRFs are coupled in a unified graphical model, called the coupled HCRF, in which the joint dependence of the cluster labels and the face-name association is naturally embedded in the correlation between the two HCRFs. We provide an effective algorithm that optimizes the two HCRFs iteratively, and performance on both tasks improves on real-world data.
Original language: English (US)
Pages (from-to): 5780-5792
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Volume: 25
Issue number: 12
State: Published - Aug 18 2016
Externally published: Yes

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work was supported in part by the 863 Program under Grant 2014AA015100, in part by the National Natural Science Foundation of China under Grant 61332016, Grant 61572500, and Grant 61379100, and in part by the DARPA PerSEAS Program under Grant HR0011-10-C-0112. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Nikolaos V. Boulgouris.
