DeepViral: prediction of novel virus-host interactions from protein sequences and infectious disease phenotypes.

Wang Liu-Wei, Senay Kafkas, Jun Chen, Nicholas J Dimonaco, Jesper Tegner, Robert Hoehndorf

Research output: Contribution to journal > Article > peer-review

32 Scopus citations


Motivation: Infectious diseases caused by novel viruses have become a major public health concern. Rapid identification of virus-host interactions can reveal mechanistic insights into infectious diseases and shed light on potential treatments. Current computational prediction methods for novel viruses are based mainly on protein sequences. However, it is not clear to what extent other important features, such as the symptoms caused by the viruses, could contribute to a predictor. Disease phenotypes (i.e., signs and symptoms) are readily accessible from clinical diagnosis and we hypothesize that they may act as a potential proxy and an additional source of information for the underlying molecular interactions between the pathogens and hosts.

Results: We developed DeepViral, a deep learning based method that predicts protein-protein interactions (PPI) between humans and viruses. Motivated by the potential utility of infectious disease phenotypes, we first embedded human proteins and viruses in a shared space using their associated phenotypes and functions, supported by formalized background knowledge from biomedical ontologies. By jointly learning from protein sequences and phenotype features, DeepViral significantly improves over existing sequence-based methods for intra- and inter-species PPI prediction.

Availability: Code and datasets for reproduction and customization are available at Prediction results for 14 virus families are available at

Original language: English (US)
Journal: Bioinformatics (Oxford, England)
State: Published - Mar 8 2021

Bibliographical note

KAUST Repository Item: Exported on 2021-03-11
Acknowledgements: We would like to thank Maxat Kulmanov and Mona Alshahrani for their advice on earlier versions of this work. We also thank Jeffery Law for making public the mappings of the SARS-CoV-2 proteins. We acknowledge the use of computational resources from the KAUST Supercomputing Core Laboratory.

ASJC Scopus subject areas

  • Biochemistry
  • Computational Theory and Mathematics
  • Computational Mathematics
  • Molecular Biology
  • Statistics and Probability
  • Computer Science Applications

