Cross-modal hashing has been intensively studied as a way to retrieve multi-modal data efficiently across modalities. Supervised cross-modal hashing methods leverage the labels of training data to improve retrieval performance. However, most of these methods still assume that the semantic labels of the training data are complete and noise-free. This assumption is too optimistic for real multi-modal data, whose label annotations are inherently error-prone. To achieve effective cross-modal hashing on multi-modal data with noisy labels, we introduce an end-to-end solution called Noise-robust Deep Cross-modal Hashing (NrDCMH). NrDCMH contains two main components: a noise instance detection module and a hash code learning module. In the noise detection module, NrDCMH first detects noisy training instance pairs based on the margin between the label similarity and the feature similarity, and assigns a weight to each pair using this margin. In the hash learning module, NrDCMH incorporates the weights into a likelihood loss function to reduce the impact of instances with noisy labels, and learns compatible deep features by applying different neural networks to multi-modality data in a unified end-to-end framework. Experimental results on multi-modal benchmark datasets demonstrate that NrDCMH performs significantly better than competitive methods when label annotations are noisy. NrDCMH also achieves competitive results in ‘noise-free’ scenarios.
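The two-module idea described above can be illustrated with a minimal NumPy sketch. It is not the NrDCMH formulation itself, since the abstract does not give the exact equations. The sketch assumes cosine similarity for both labels and features, a hypothetical linear weighting rule (weight = 1 - margin), and a pairwise negative log-likelihood of the kind commonly used in deep cross-modal hashing. All function names are illustrative.

```python
import numpy as np


def label_similarity(labels):
    """Cosine similarity between non-negative multi-hot label vectors (rows)."""
    norms = np.linalg.norm(labels, axis=1, keepdims=True)
    unit = labels / np.maximum(norms, 1e-12)
    return unit @ unit.T  # values in [0, 1] for non-negative labels


def feature_similarity(x, y):
    """Cosine similarity between features of two modalities, rescaled to [0, 1]."""
    xu = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    yu = y / np.maximum(np.linalg.norm(y, axis=1, keepdims=True), 1e-12)
    return (xu @ yu.T + 1.0) / 2.0


def pair_weights(label_sim, feat_sim):
    """Assumed weighting rule: a large margin between label similarity and
    feature similarity suggests a noisy pair, so its weight shrinks linearly."""
    margin = np.abs(label_sim - feat_sim)
    return 1.0 - margin  # in [0, 1] when both similarities are in [0, 1]


def weighted_likelihood_loss(f, g, sim, weights):
    """Weighted pairwise negative log-likelihood (assumed form).

    f, g    : real-valued code outputs of the two modality networks (n x bits)
    sim     : binary pairwise similarity matrix (n x n)
    weights : per-pair weights from pair_weights (n x n)
    """
    theta = 0.5 * (f @ g.T)
    # log(1 + exp(theta)) computed stably with logaddexp
    nll = np.logaddexp(0.0, theta) - sim * theta
    return float(np.sum(weights * nll))
```

In this sketch, a pair whose labels claim similarity while the features disagree receives a small weight, so it contributes less to the loss that drives hash code learning.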
Bibliographical note
KAUST Repository Item: Exported on 2021-11-20
Acknowledgements: This work was supported by the Natural Science Foundation of China (61872300 and 62031003).
ASJC Scopus subject areas
- Artificial Intelligence
- Theoretical Computer Science
- Information Systems and Management
- Control and Systems Engineering
- Computer Science Applications