Jason Wu, Yu Qiu, Eugenia Lyashenko, Tess Torregrosa, Edith L Pfister, Michael J Ryan, Christian Mueller, Sourav R Choudhury
Human Gene Therapy, vol. 36, no. 9-10, pp. 823-829. Published 2025-05-01 (Epub 2025-04-16). DOI: 10.1089/hum.2024.227 (https://doi.org/10.1089/hum.2024.227)
Prediction of Adeno-Associated Virus Fitness with a Protein Language-Based Machine Learning Model.
Adeno-associated virus (AAV)-based therapeutics have the potential to transform the lives of patients by delivering one-time treatments for a variety of diseases. However, a critical challenge to their widespread adoption and distribution is the high cost of goods. Reducing manufacturing costs by developing AAV capsids with improved yield, or fitness, is key to making gene therapies more affordable. AAV fitness is largely determined by the amino acid sequence of the capsid; however, engineered AAVs are rarely optimized for manufacturability. Here, we report a state-of-the-art machine learning (ML) model that predicts the fitness of AAV2 capsid mutants based on the amino acid sequence of the capsid monomer. By combining a protein language model (PLM) and classical ML techniques, our model achieved high prediction accuracy (Pearson correlation = 0.818) for capsid fitness. Importantly, tests on completely independent datasets showed that the model is robust and generalizable, even for multimutant AAV capsids. Our ML-based model can be used as a surrogate for laborious in vitro experiments, thus saving time and resources, and can be deployed to increase the fitness of clinical AAV capsids to make gene therapies economically viable for patients.
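The general shape of such a pipeline (sequence embedding, a classical regressor, evaluation by Pearson correlation) can be illustrated with a minimal sketch. The abstract does not name the PLM or the classical ML method, so everything below is an assumption made for illustration: `embed` is a hypothetical stand-in that uses amino-acid composition instead of a real PLM embedding, `knn_predict` is a k-nearest-neighbour regressor chosen only for simplicity, and the sequences and fitness values are synthetic toy data, not results from the paper.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq):
    """Hypothetical stand-in for a PLM embedding: 20-dim amino-acid composition."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def knn_predict(train_x, train_y, query, k=2):
    """Assumed classical regressor: mean fitness of the k nearest training embeddings."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def pearson(xs, ys):
    """Pearson correlation between predicted and measured values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Synthetic toy peptides and fitness values, for illustration only.
    train_seqs = ["AAAA", "AAAC", "ACCC", "CCCC"]
    train_fit = [1.0, 2.0, 3.0, 4.0]
    test_seqs = ["AACC", "AAAG", "CCCA"]
    test_fit = [2.5, 1.2, 3.6]

    train_x = [embed(s) for s in train_seqs]
    preds = [knn_predict(train_x, train_fit, embed(s)) for s in test_seqs]
    print("Pearson r on held-out toy data:", pearson(preds, test_fit))
```

In a realistic setting, `embed` would be replaced by per-sequence embeddings from a pretrained protein language model, and the regressor would be fitted on measured capsid fitness data; the held-out Pearson correlation is the same kind of metric the abstract reports.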
About the journal:
Human Gene Therapy is the premier, multidisciplinary journal covering all aspects of gene therapy. The Journal publishes in-depth coverage of DNA, RNA, and cell therapies by delivering the latest breakthroughs in research and technologies. Human Gene Therapy provides a central forum for scientific and clinical information, including ethical, legal, regulatory, social, and commercial issues, which enables the advancement and progress of therapeutic procedures leading to improved patient outcomes, and ultimately, to curing diseases.