NanoBinder: a machine learning assisted nanobody binding prediction tool using Rosetta energy scores

IF 7.1, Zone 2 (Chemistry), JCR Q1 (Chemistry, Multidisciplinary)

Journal of Cheminformatics, Pub Date: 2025-06-16, DOI: 10.1186/s13321-025-01040-1

Palistha Shrestha, Chandana S. Talwar, Jeevan Kandel, Kwang-Hyun Park, Kil To Chong, Eui-Jeon Woo, Hilal Tayara


Citations: 0

Abstract

Nanobodies offer significant therapeutic potential due to their small size, stability, and versatility. Although advances in computational protein design have made de novo nanobody design increasingly feasible, few tools are specifically tailored for this purpose. Rosetta, with its specialized protocols, is a prominent tool for nanobody design but is limited by a high false-negative rate, necessitating extensive high-throughput screening. This increases costs, time, and labor because of the need for large-scale experimentation and detailed structural analysis. To address current challenges in nanobody design, we introduce NanoBinder, an interpretable machine learning model that predicts nanobody-antigen binding using Rosetta energy scores. NanoBinder uses a Random Forest model trained on experimentally validated complexes and can be seamlessly integrated into the Rosetta software. It employs SHAP summary plots for interpretability, which help identify key features influencing binding interactions. Experimentally validated on forty-nine diverse nanobodies, NanoBinder accurately predicts non-binders and shows reasonable performance in identifying binders. This approach significantly enhances predictive accuracy, reduces the need for extensive experimental assays, and accelerates nanobody development, thereby offering a powerful tool to mitigate the costs, time, and labor associated with high-throughput screening.

Scientific contribution

This study introduces NanoBinder, a machine learning framework for predicting nanobody-antigen binding using Rosetta-derived energy features. Through rigorous experimental validation across diverse nanobody sets, NanoBinder enhances nanobody screening workflows by reducing false positives and minimizing reliance on extensive wet-lab assays. The approach bridges the gap between physics-based modeling and data-driven prediction in nanobody design.
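The abstract describes performance in terms of binders, non-binders, false negatives, and false positives. As a minimal sketch of how those quantities relate, the Python snippet below computes them from binary labels. The function name `screening_metrics`, the `ScreeningMetrics` class, and the eight example labels are hypothetical illustrations; they are not taken from the paper and do not reproduce its results or its evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    sensitivity: float          # fraction of true binders predicted as binders
    specificity: float          # fraction of true non-binders predicted as non-binders
    false_negative_rate: float  # fraction of true binders that were missed
    false_positive_rate: float  # fraction of true non-binders flagged as binders


def screening_metrics(y_true, y_pred):
    """Compute binder/non-binder metrics, with 1 = binder and 0 = non-binder."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binder and one non-binder")
    return ScreeningMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        false_negative_rate=fn / (tp + fn),
        false_positive_rate=fp / (tn + fp),
    )


# Hypothetical labels for eight designs (illustrative only, not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

In this toy case, high specificity corresponds to "accurately predicts non-binders", while sensitivity reflects how well binders are identified; the false-negative rate is simply one minus sensitivity.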




Source journal

Journal of Cheminformatics (Chemistry, Multidisciplinary; Computer Science, Information Systems)

CiteScore: 14.10

Self-citation rate: 7.00%

Review time: 3 months

About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling; chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases; computer and molecular graphics; computer-aided molecular design; expert systems; QSAR; and data mining techniques.