AggNet: Advancing protein aggregation analysis through deep learning and protein language model

Wenjia He, Xiaopeng Xu, Haoyang Li, Juexiao Zhou, Xin Gao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Protein aggregation is critical to various biological and pathological processes and is an important property in biotherapeutic development. However, experimental methods for profiling protein aggregation are costly and labor-intensive, driving the need for more efficient computational alternatives. In this study, we introduce “AggNet,” a novel deep learning framework based on the protein language model ESM2 and AlphaFold2, which uses physicochemical, evolutionary, and structural information to discriminate amyloid from non-amyloid peptides and to identify aggregation-prone regions (APRs) in diverse proteins. Benchmark comparisons show that AggNet outperforms existing methods and achieves state-of-the-art performance on protein aggregation prediction. The predictive ability of AggNet is also stable across proteins with different secondary structures. Feature analysis and visualizations show that the model effectively captures peptides' physicochemical properties, thereby offering enhanced interpretability. Further validation through a case study on MEDI1912 confirms AggNet's practical utility in analyzing protein aggregation and guiding mutations for aggregation mitigation. This study enhances computational tools for predicting protein aggregation and highlights the potential of AggNet in protein engineering. The source code is available at: https://github.com/Hill-Wenka/AggNet.

Original language: English (US)
Article number: e70031
Journal: Protein Science
Volume: 34
Issue number: 2
State: Published - Feb 2025

Bibliographical note

Publisher Copyright:
© 2025 The Protein Society.

Keywords

  • amyloid
  • APR
  • computational biology
  • machine learning
  • protein aggregation
  • protein engineering

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
