Title: AggNet: Advancing protein aggregation analysis through deep learning and protein language model.
Authors: Wenjia He, Xiaopeng Xu, Haoyang Li, Juexiao Zhou, Xin Gao
Journal: Protein Science, 34(2), e70031
Publication date: 2025-02-01
Publication type: Journal Article
DOI: 10.1002/pro.70031 (https://doi.org/10.1002/pro.70031)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751882/pdf/
Impact factor: 4.5
JCR: Q1 (Biochemistry & Molecular Biology)
Subject category: Biology (region 3)
Citations: 0
Abstract
Protein aggregation is critical to various biological and pathological processes. It is also an important property in biotherapeutic development. However, experimental methods to profile protein aggregation are costly and labor-intensive, driving the need for more efficient computational alternatives. In this study, we introduce "AggNet," a novel deep learning framework based on the protein language model ESM2 and AlphaFold2, which utilizes physicochemical, evolutionary, and structural information to discriminate between amyloid and non-amyloid peptides and to identify aggregation-prone regions (APRs) in diverse proteins. Benchmark comparisons show that AggNet outperforms existing methods and achieves state-of-the-art performance on protein aggregation prediction. The predictive ability of AggNet is also stable across proteins with different secondary structures. Feature analysis and visualizations show that the model effectively captures peptides' physicochemical properties, thereby offering enhanced interpretability. Further validation through a case study on MEDI1912 confirms AggNet's practical utility in analyzing protein aggregation and guiding mutations for aggregation mitigation. This study enhances computational tools for predicting protein aggregation and highlights the potential of AggNet in protein engineering. The source code is available at: https://github.com/Hill-Wenka/AggNet.
About the journal:
Protein Science, the flagship journal of The Protein Society, is a publication that focuses on advancing fundamental knowledge in the field of protein molecules. The journal welcomes original reports and review articles that contribute to our understanding of protein function, structure, folding, design, and evolution.
Additionally, Protein Science encourages papers that explore the applications of protein science in various areas such as therapeutics, protein-based biomaterials, bionanotechnology, synthetic biology, and bioelectronics.
The journal accepts manuscript submissions in any suitable format for review; conversion to journal-style format is required only upon acceptance for publication.
Protein Science is indexed and abstracted in numerous databases, including the Agricultural & Environmental Science Database (ProQuest), Biological Science Database (ProQuest), CAS: Chemical Abstracts Service (ACS), Embase (Elsevier), Health & Medical Collection (ProQuest), Health Research Premium Collection (ProQuest), Materials Science & Engineering Database (ProQuest), MEDLINE/PubMed (NLM), Natural Science Collection (ProQuest), and SciTech Premium Collection (ProQuest).