Accelerating bioactive peptide discovery via mutual information-based meta-learning.

Wenjia He, Yi Jiang, Junru Jin, Zhongshen Li, Jiaojiao Zhao, Balachandran Manavalan, Ran Su, Xin Gao, Leyi Wei

Research output: Contribution to journal › Article › peer-review

38 Scopus citations

Abstract

Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, these methods often cannot be trained sufficiently, which easily results in poor generalizability. Furthermore, there is no generic computational framework for predicting the bioactivities of different peptides. A natural question is therefore whether limited samples can be used to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can learn the discriminative information among the various functions and characterize their functional differences. Experimental results show that MIMML performs excellently despite using far fewer training samples than state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides a first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available at https://github.com/TearsWaiting/MIMML.
Original language: English (US)
Journal: Briefings in bioinformatics
State: Published - Dec 9 2021

Bibliographical note

KAUST Repository Item: Exported on 2022-01-13
Acknowledged KAUST grant number(s): FCC/1/1976-04-01, REI/1/0018-01-01, REI/1/4473-01-01, REI/1/4742-01-01, URF/1/4098-01-01
Acknowledgements: This work was supported by the Natural Science Foundation of China (62072329 and 62071278) and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award Nos. FCC/1/1976-04-01, URF/1/4098-01-01, REI/1/0018-01-01, REI/1/4473-01-01 and REI/1/4742-01-01.

ASJC Scopus subject areas

  • Molecular Biology
  • Information Systems
