Attention-Based Multimodal Entity Linking with High-Quality Images

Li Zhang, Zhixu Li, Qiang Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Scopus citations

Abstract

Multimodal entity linking (MEL) is an emerging research field that uses both textual and visual information to map an ambiguous mention to an entity in a knowledge base (KB). However, images do not always help; they may even backfire when they are irrelevant to the textual content. Moreover, existing efforts mainly focus on learning representations of mentions and entities from their textual and visual contexts, without considering the negative impact of noisy, irrelevant images, which occur frequently in social media posts. In this paper, we propose a novel MEL model that not only removes the negative impact of noisy images, but also uses multiple attention mechanisms to better capture the connection between mention representations and their corresponding entity representations. Our empirical study on a large real-world data collection demonstrates the effectiveness of our approach.
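The abstract names two ideas: suppressing noisy, irrelevant images and attending across modalities. As a rough illustration only, below is a minimal PyTorch sketch of one plausible instantiation: cross-attention fusion with a learned image-relevance gate. All module names, dimensions, and the gating design are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class GatedMultimodalFusion(nn.Module):
    """Illustrative sketch: attention over image regions, gated by relevance.

    Hypothetical design, not the authors' model: a gate near 0 lets the
    mention representation fall back to text-only when the image is noise.
    """
    def __init__(self, dim=256, num_heads=4):
        super().__init__()
        # Cross-attention: textual mention tokens attend over visual regions.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Relevance gate scores how much the image should contribute.
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, 1), nn.Sigmoid(),
        )
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_feats, image_feats):
        # text_feats:  (batch, text_len, dim)  e.g. token embeddings of the mention context
        # image_feats: (batch, regions, dim)   e.g. region features from a visual backbone
        attended, _ = self.cross_attn(text_feats, image_feats, image_feats)
        # Pool each modality to a single vector for the gate.
        t = text_feats.mean(dim=1)
        v = image_feats.mean(dim=1)
        g = self.gate(torch.cat([t, v], dim=-1)).unsqueeze(1)  # (batch, 1, 1)
        # Gate the visual evidence before fusing it into the text stream.
        return self.norm(text_feats + g * attended)

# Toy usage with random features.
fusion = GatedMultimodalFusion()
text = torch.randn(2, 12, 256)
image = torch.randn(2, 36, 256)
out = fusion(text, image)
print(out.shape)  # torch.Size([2, 12, 256])
```

One sentence on the design choice: gating the attended visual features (rather than masking the image outright) lets the model learn a soft, per-example trade-off, which matches the paper's stated goal of removing the negative impact of irrelevant images while still exploiting helpful ones.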
Original language: English (US)
Title of host publication: Database Systems for Advanced Applications
Publisher: Springer International Publishing
Pages: 533-548
Number of pages: 16
ISBN (Print): 9783030731960
DOIs
State: Published - Apr 6 2021

Bibliographical note

KAUST Repository Item: Exported on 2021-05-06
Acknowledgements: This research is supported by National Key R&D Program of China (No. 2018AAA0101900), the Priority Academic Program Development of Jiangsu Higher Education Institutions, National Natural Science Foundation of China (Grant No. 62072323, 61632016), Natural Science Foundation of Jiangsu Province (No. BK20191420), and the Suda-Toycloud Data Intelligence Joint Laboratory.

