Prediction of Adeno-Associated Virus Fitness with a Protein Language-Based Machine Learning Model.

Impact Factor: 3.9 | CAS Zone 3 (Medicine) | JCR Q2 | Biotechnology & Applied Microbiology
Human Gene Therapy | Pub Date: 2025-05-01 | Epub Date: 2025-04-16 | DOI: 10.1089/hum.2024.227
Jason Wu, Yu Qiu, Eugenia Lyashenko, Tess Torregrosa, Edith L Pfister, Michael J Ryan, Christian Mueller, Sourav R Choudhury
Volume 36, Issue 9-10, Pages 823-829. Publication type: Journal Article.

Abstract

Adeno-associated virus (AAV)-based therapeutics have the potential to transform the lives of patients by delivering one-time treatments for a variety of diseases. However, a critical challenge to their widespread adoption and distribution is the high cost of goods. Reducing manufacturing costs by developing AAV capsids with improved yield, or fitness, is key to making gene therapies more affordable. AAV fitness is largely determined by the amino acid sequence of the capsid; however, engineered AAVs are rarely optimized for manufacturability. Here, we report a state-of-the-art machine learning (ML) model that predicts the fitness of AAV2 capsid mutants based on the amino acid sequence of the capsid monomer. By combining a protein language model (PLM) and classical ML techniques, our model achieved high prediction accuracy (Pearson correlation = 0.818) for capsid fitness. Importantly, tests on completely independent datasets showed the robustness and generalizability of our model, even for multimutant AAV capsids. Our accurate ML-based model can be used as a surrogate for laborious in vitro experiments, thus saving time and resources, and can be deployed to increase the fitness of clinical AAV capsids to make gene therapies economically viable for patients.
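
The abstract describes a general pattern: turn each capsid sequence into a numeric vector with a PLM, fit a classical regressor on measured fitness, and score predictions by Pearson correlation. The sketch below illustrates that pattern only; it is not the authors' pipeline. Everything in it is an assumption made for illustration: `embed` is a stand-in that uses amino-acid composition instead of a real PLM embedding, the k-nearest-neighbor regressor is one arbitrary choice of classical ML technique (the abstract does not name the one used), and the sequences and fitness values in the demo are invented toy data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq):
    """Hypothetical stand-in for a PLM embedding.

    Returns the fraction of each of the 20 standard amino acids in seq.
    A real pipeline would call a protein language model here instead.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def knn_predict(train, query_vec, k=2):
    """Predict fitness as the mean over the k nearest training vectors.

    train is a list of (vector, fitness) pairs; distance is Euclidean.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query_vec))
    nearest = ranked[:k]
    return sum(fitness for _, fitness in nearest) / len(nearest)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Toy data, invented for illustration: (sequence fragment, fitness).
    train_set = [("MAADGY", 1.2), ("MAADGW", 0.9), ("MKKDGY", -0.4), ("MKKKGY", -1.1)]
    test_set = [("MAADGF", 1.0), ("MKKDGW", -0.6), ("MAKDGY", 0.3)]

    train = [(embed(seq), fitness) for seq, fitness in train_set]
    predicted = [knn_predict(train, embed(seq)) for seq, _ in test_set]
    measured = [fitness for _, fitness in test_set]
    print("Pearson r on toy held-out set:", pearson(predicted, measured))
```

Scoring on sequences that were not used for fitting, as the demo does, mirrors the abstract's emphasis on testing against independent datasets; the number printed for the toy data carries no biological meaning.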

Source journal
Human Gene Therapy (Medicine: Biotechnology & Applied Microbiology)
CiteScore: 6.50
Self-citation rate: 4.80%
Articles published: 131
Review time: 4-8 weeks
About the journal: Human Gene Therapy is the premier, multidisciplinary journal covering all aspects of gene therapy. The Journal publishes in-depth coverage of DNA, RNA, and cell therapies by delivering the latest breakthroughs in research and technologies. Human Gene Therapy provides a central forum for scientific and clinical information, including ethical, legal, regulatory, social, and commercial issues, which enables the advancement and progress of therapeutic procedures leading to improved patient outcomes, and ultimately, to curing diseases.