Abstract
Learning vector representations (i.e., embeddings) of nodes in graph-structured information networks has attracted vast interest from both industry and academia. Most real-world networks are complex and heterogeneous, encoding high-order relationships and rich semantic information among nodes. However, existing heterogeneous network embedding (HNE) frameworks are commonly designed in a centralized fashion, i.e., all data storage and learning take place on a single machine. Hence, these HNE methods show severe performance bottlenecks when handling large-scale networks due to their high consumption of memory, storage, and running time. In light of this, to cope with large-scale HNE tasks with strong efficiency and effectiveness guarantees, we propose the Decentralized Deep Heterogeneous Hypergraph (DDHH) embedding framework in this paper. In DDHH, we innovatively formulate a large heterogeneous network as a hypergraph, in which each hyperedge can connect a set of semantically similar nodes. Our framework then intelligently partitions the heterogeneous network using the identified hyperedges. Each resulting subnetwork is then assigned to a distributed worker, which employs the deep information maximization theorem to locally learn node embeddings from the partition it receives. We further devise a novel embedding alignment scheme to precisely project the independently learned node embeddings from all subnetworks onto a public vector space, thus allowing for downstream tasks. As shown by our experimental results, DDHH significantly improves the efficiency and accuracy of existing HNE models, and can easily scale up to large-scale heterogeneous networks.
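
To make the hypergraph formulation and the partitioning step more concrete, the following is a minimal sketch and not the DDHH algorithm itself. It assumes a hypergraph stored as a mapping from hyperedge ids to node sets, and it swaps in a naive greedy balancing rule for the partitioning strategy, which the abstract does not specify. The names `partition_by_hyperedges` and `shared_nodes`, as well as the toy author/paper/venue data, are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def partition_by_hyperedges(hyperedges, num_workers):
    """Greedily assign whole hyperedges to workers, balancing node counts.

    hyperedges: dict mapping a hyperedge id to a set of node ids.
    Returns a list of node sets, one per worker.
    NOTE: a naive stand-in; the paper's partitioning rule is not given here.
    """
    partitions = [set() for _ in range(num_workers)]
    # Place larger hyperedges first; ties are broken by hyperedge id.
    order = sorted(hyperedges, key=lambda e: (-len(hyperedges[e]), e))
    for edge_id in order:
        # Pick the worker currently holding the fewest nodes.
        target = min(range(num_workers), key=lambda w: len(partitions[w]))
        # A hyperedge is never split: all of its nodes go to one worker.
        partitions[target] |= hyperedges[edge_id]
    return partitions


def shared_nodes(partitions):
    """Return the nodes that appear in more than one partition."""
    counts = defaultdict(int)
    for part in partitions:
        for node in part:
            counts[node] += 1
    return {node for node, count in counts.items() if count > 1}


# Toy heterogeneous network: authors (a*), papers (p*), venues (v*).
hyperedges = {
    "h1": {"a1", "a2", "p1", "v1"},
    "h2": {"a2", "a3", "p2", "v1"},
    "h3": {"a4", "p3", "v2"},
    "h4": {"a4", "a5", "p4", "v2"},
}

partitions = partition_by_hyperedges(hyperedges, num_workers=2)
anchors = shared_nodes(partitions)
```

In this sketch, each worker would learn embeddings only for the nodes in its own partition. Nodes that end up in more than one partition (`anchors`) are one plausible source of reference points for projecting the separately learned embeddings onto a common space, although whether DDHH's alignment scheme relies on such overlap is an assumption here.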
Original language | English (US) |
---|---|
Title of host publication | 2021 IEEE 37th International Conference on Data Engineering (ICDE) |
Publisher | IEEE |
Pages | 2033-2038 |
Number of pages | 6 |
ISBN (Print) | 9781728191843 |
State | Published - Apr 2021 |
Bibliographical note
KAUST Repository Item: Exported on 2021-12-15
Acknowledgements: This work was supported by ARC Discovery Project (Grant No. DP190101985, DP170103954).