Holistic in silico developability assessment of novel classes of small proteins using publicly available sequence-based predictors

Impact factor: 3.0; Zone 3, Biology; JCR Q3, Biochemistry & Molecular Biology
Daniel A. M. Pais, Jan-Peter A. Mayer, Karin Felderer, Maria B. Batalha, Timo Eichner, Sofia T. Santos, Raman Kumar, Sandra D. Silva, Hitto Kaufmann
Journal of Computer-Aided Molecular Design, 38 (1). Published 2024-08-20. DOI: 10.1007/s10822-024-00569-x
https://link.springer.com/article/10.1007/s10822-024-00569-x
Citations: 0

Abstract

The development of novel therapeutic proteins is a lengthy and costly process, with an average attrition rate of 91% (Thomas et al. Clinical Development Success Rates and Contributing Factors 2011–2020, 2021). To increase the probability of success and ensure robust drug supply beyond approval, it is essential to assess the developability profile of new potential drug candidates as early and broadly as possible in development (Jain et al. MAbs, 2023. https://doi.org/10.1016/j.copbio.2011.06.002). Predicting these properties in silico is expected to be the next leap in innovation as it would enable significantly reduced development timelines combined with broader screens at lower costs. However, developing predictive algorithms typically requires substantial datasets generated under very defined conditions, a limiting factor especially for new classes of therapeutic proteins that hold immense clinical promise. Here we describe a strategy for assessing the developability of a novel class of small therapeutic Anticalin® proteins using machine learning in conjunction with a knowledge-driven approach. The knowledge-driven approach considers developability attributes such as aggregation propensity, charge variants, immunogenicity, specificity, thermal stability, hydrophobicity, and potential post-translational modifications to calculate a holistic developability score. Using sequence-derived descriptors as input parameters, we established novel statistical models designed to predict the developability scores for Anticalin proteins. The best models yielded low root mean square errors across the entire dataset and were further validated by removing input data from individual screening campaigns and predicting developability scores for those drug candidates. The adoption of the described workflow will enable significantly streamlined preclinical development of Anticalin drug candidates and could potentially be applied to other therapeutic protein scaffolds.
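The validation step mentioned in the abstract (holding out the data of one screening campaign, refitting, and predicting scores for the held-out candidates) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the single numeric descriptor, the one-variable least-squares line used as a stand-in model, the campaign names, and the toy data are all hypothetical and are not taken from the paper, whose actual descriptors and statistical models are not given in the abstract.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in model, not the paper's)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted scores."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def leave_one_campaign_out(records):
    """records: list of (campaign, descriptor, score) tuples.

    For each campaign, fit on all other campaigns and report the RMSE
    on the held-out campaign. Returns {campaign: rmse}.
    """
    results = {}
    for held_out in sorted({campaign for campaign, _, _ in records}):
        train = [(x, y) for c, x, y in records if c != held_out]
        test = [(x, y) for c, x, y in records if c == held_out]
        intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])
        predictions = [intercept + slope * x for x, _ in test]
        results[held_out] = rmse([y for _, y in test], predictions)
    return results


# Hypothetical toy data in which score = 2 * descriptor + 1 exactly.
records = [
    ("campaign_A", 0.0, 1.0), ("campaign_A", 1.0, 3.0),
    ("campaign_B", 2.0, 5.0), ("campaign_B", 3.0, 7.0),
    ("campaign_C", 4.0, 9.0), ("campaign_C", 5.0, 11.0),
]
errors = leave_one_campaign_out(records)
for campaign, error in errors.items():
    print(f"{campaign}: held-out RMSE = {error:.3g}")
```

Grouping the split by campaign, rather than splitting candidates at random, means each error estimate reflects how well a model transfers to a screening campaign it has never seen, which is the situation the abstract describes.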


Source journal
Journal of Computer-Aided Molecular Design (Biology; Computer Science, Interdisciplinary Applications)
CiteScore: 8.00
Self-citation rate: 8.60%
Articles published: 56
Review time: 3 months
Journal description: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: - theoretical chemistry; - computational chemistry; - computer and molecular graphics; - molecular modeling; - protein engineering; - drug design; - expert systems; - general structure-property relationships; - molecular dynamics; - chemical database development and usage.