Abstract
Document-level Relation Extraction (DocRE) is the task of extracting relational facts mentioned anywhere in an entire document. Despite its popularity, two major difficulties remain: (i) how to learn more informative embeddings for entity pairs, and (ii) how to capture, from the document, the crucial context describing the relation between an entity pair. To tackle the first challenge, we propose to encode the document with a task-specific pre-trained encoder, whose pre-training involves three tasks. While one novel task is designed to learn relation semantics from diverse expressions by utilizing relation-aware pre-training data, the other two tasks, Masked Language Modeling (MLM) and Mention Reference Prediction (MRP), are adopted to enhance the encoder's capacity for text understanding and coreference capturing. To address the second challenge, we craft a hierarchical attention mechanism that refines the context for entity pairs, considering both the embeddings from the encoder and the sequential distance information of mentions in the given document. Extensive experiments on the benchmark dataset DocRED verify that our method outperforms the baselines.
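To make the distance-aware attention idea concrete, below is a minimal PyTorch sketch of one way mention embeddings could be pooled for an entity pair while biasing attention scores with sequential distance. This is an illustrative assumption only: the class name `DistanceAwareAttention`, the scalar distance-bucket bias, and all parameters are hypothetical, and the sketch shows a single attention step rather than the paper's full hierarchical mechanism.

```python
# Hypothetical sketch of distance-biased attention pooling over mentions.
# Not the paper's implementation; names and bucketing scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceAwareAttention(nn.Module):
    """Scores each mention against an entity-pair query and adds a learned
    bias derived from the mention's sequential distance in the document."""
    def __init__(self, hidden_dim: int, num_buckets: int = 32):
        super().__init__()
        self.query_proj = nn.Linear(2 * hidden_dim, hidden_dim)  # entity-pair query
        self.dist_emb = nn.Embedding(num_buckets, 1)              # scalar distance bias
        self.num_buckets = num_buckets

    def forward(self, head_emb, tail_emb, mention_embs, mention_dists):
        # head_emb, tail_emb: (hidden_dim,) embeddings of the entity pair
        # mention_embs: (num_mentions, hidden_dim) contextual mention embeddings
        # mention_dists: (num_mentions,) token distances from the entity pair
        query = self.query_proj(torch.cat([head_emb, tail_emb], dim=-1))
        scores = mention_embs @ query                             # (num_mentions,)
        buckets = mention_dists.clamp(max=self.num_buckets - 1)
        scores = scores + self.dist_emb(buckets).squeeze(-1)      # add distance bias
        weights = F.softmax(scores, dim=-1)
        return weights @ mention_embs                             # pooled context

# Usage with random tensors, just to show the shapes involved.
attn = DistanceAwareAttention(hidden_dim=768)
h, t = torch.randn(768), torch.randn(768)
mentions = torch.randn(10, 768)
dists = torch.randint(0, 100, (10,))
context = attn(h, t, mentions, dists)  # (768,)
```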
Original language | English (US) |
---|---|
Title of host publication | Web Information Systems Engineering – WISE 2021 |
Publisher | Springer International Publishing |
Pages | 347-362 |
Number of pages | 16 |
ISBN (Print) | 9783030915599 |
State | Published - Jan 1 2022 |
Bibliographical note
KAUST Repository Item: Exported on 2022-01-11. Acknowledgements: We are grateful to Heng Ye, Jiaan Wang, and all reviewers for their constructive comments. This work was supported by the National Key R&D Program of China (No. 2018AAA0101900), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the National Natural Science Foundation of China (Grant Nos. 62072323, 61632016, 62102276), and the Natural Science Foundation of Jiangsu Province (No. BK20191420).