Graph embedding and unsupervised learning predict genomic sub-compartments from HiC chromatin interaction data

Haitham Ashoor, Xiaowen Chen, Wojciech Rosikiewicz, Jiahui Wang, Albert Cheng, Ping Wang, Yijun Ruan, Sheng Li

Research output: Contribution to journal › Article › peer-review

14 Scopus citations


Chromatin interaction studies can reveal how the genome is organized into spatially confined sub-compartments in the nucleus. However, accurately identifying sub-compartments from chromatin interaction data remains a challenge in computational biology. Here, we present Sub-Compartment Identifier (SCI), an algorithm that uses graph embedding followed by unsupervised learning to predict sub-compartments using Hi-C chromatin interaction data. We find that the network topological centrality and clustering performance of SCI sub-compartment predictions are superior to those of hidden Markov model (HMM) sub-compartment predictions. Moreover, using orthogonal Chromatin Interaction Analysis by in situ Paired-End Tag Sequencing (ChIA-PET) data, we confirmed that SCI sub-compartment prediction outperforms HMM. We show that SCI-predicted sub-compartments have distinct epigenetic marks, transcriptional activities, and transcription factor enrichment. Moreover, we present a deep neural network to predict sub-compartments using epigenome, replication timing, and sequence data. Our neural network makes more accurate sub-compartment predictions when SCI-determined sub-compartments are used as labels for training.
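
The two-stage idea in the abstract (embed the Hi-C interaction graph, then cluster the embedded bins without labels) can be illustrated with a minimal sketch. The abstract does not name the embedding or clustering method, so this example substitutes a simple random-walk diffusion profile for the embedding and k-means for the clustering; it is not the SCI implementation. The function names, the `steps` parameter, and the six-bin toy contact matrix are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT the SCI algorithm. The embedding
# (random-walk diffusion profiles) and the clustering (k-means) are
# stand-ins, since the abstract does not specify either method.

def row_normalize(matrix):
    """Convert contact counts into random-walk transition probabilities."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def matmul(a, b):
    """Plain-Python matrix product of two square lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def diffusion_embedding(contacts, steps=3):
    """Embed each genomic bin as its `steps`-step random-walk profile."""
    p = row_normalize(contacts)
    walk = p
    for _ in range(steps - 1):
        walk = matmul(walk, p)
    return walk

def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: sqdist(pt, centers[c]))
                  for pt in points]
        # Move each center to the mean of its assigned points.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical symmetric contact matrix: bins 0-2 and bins 3-5 each
# interact strongly among themselves and weakly with the other group.
contacts = [
    [0, 9, 8, 1, 0, 1],
    [9, 0, 7, 0, 1, 0],
    [8, 7, 0, 1, 0, 1],
    [1, 0, 1, 0, 9, 8],
    [0, 1, 0, 9, 0, 7],
    [1, 0, 1, 8, 7, 0],
]

embedding = diffusion_embedding(contacts, steps=3)
labels = kmeans(embedding, k=2)
print(labels)
```

On this toy input, the two densely connected groups of bins should fall into separate clusters. Real Hi-C matrices are far larger and would need normalization and a scalable embedding method, which this sketch does not attempt.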
Original language: English (US)
Journal: Nature Communications
Issue number: 1
State: Published - Mar 3 2020
Externally published: Yes
