Accessible, uniform protein property prediction with a scikit-learn based toolset AIDE.

IF 5.4
Evan Komp, Kristoffer E Johansson, Nicholas P Gauthier, Japheth E Gado, Kresten Lindorff-Larsen, Gregg T Beckham
Journal: Bioinformatics (Oxford, England)
Publication type: Journal Article
Published: 2025-09-24
DOI: 10.1093/bioinformatics/btaf544 (https://doi.org/10.1093/bioinformatics/btaf544)
Citations: 0

Abstract

Summary: Protein property prediction via machine learning with and without labeled data is becoming increasingly powerful, yet methods are disparate and capabilities vary widely over applications. The software presented here, "Artificial Intelligence Driven protein Estimation (AIDE)," enables instantiating, optimizing, and testing many zero-shot and supervised property prediction methods for variants and variable length homologs in a single, reproducible notebook or script by defining a modular, standardized application programming interface (API) that is drop-in compatible with scikit-learn transformers and pipelines.

Availability and implementation: AIDE is an importable Python package that inherits from scikit-learn classes and follows the scikit-learn API; it can be installed on Windows, Mac, and Linux. Many of the models wrapped by AIDE will be effectively inaccessible without a GPU, and some assume CUDA. The newest stable, tested version can be found at https://github.com/beckham-lab/aide_predict, and a full user guide and API reference can be found at https://beckham-lab.github.io/aide_predict/. Static versions of both at the time of writing are archived on Zenodo (Komp and Beckham 2025).

Supplementary information: Digital supplementary data contain API examples and a user guide. Appendices A and B provide PDFs of the showcase notebooks. Source data for the figures are provided.
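To make "drop-in compatible with scikit-learn transformers and pipelines" concrete, here is a minimal sketch of what that kind of compatibility can look like. It uses only scikit-learn and NumPy, not AIDE itself. The class `OneHotSequenceEmbedding`, the toy sequences, and the label values are all hypothetical and made up for illustration; they are not taken from the AIDE package or the paper. For AIDE's real classes and signatures, see the user guide linked above.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class OneHotSequenceEmbedding(TransformerMixin, BaseEstimator):
    """Hypothetical transformer (not part of AIDE): one-hot encodes
    equal-length protein sequences into a flat feature matrix."""

    def __init__(self, alphabet=AMINO_ACIDS):
        self.alphabet = alphabet

    def fit(self, X, y=None):
        # Record the common sequence length; this sketch does not handle
        # variable-length homologs.
        lengths = {len(seq) for seq in X}
        if len(lengths) != 1:
            raise ValueError("This sketch only handles equal-length sequences.")
        self.length_ = lengths.pop()
        self.index_ = {aa: i for i, aa in enumerate(self.alphabet)}
        return self

    def transform(self, X):
        n_letters = len(self.alphabet)
        out = np.zeros((len(X), self.length_ * n_letters))
        for row, seq in enumerate(X):
            if len(seq) != self.length_:
                raise ValueError("Sequence length differs from the fitted length.")
            for pos, aa in enumerate(seq):
                # Raises KeyError for characters outside the alphabet.
                out[row, pos * n_letters + self.index_[aa]] = 1.0
        return out


# Toy variants and made-up property labels, for illustration only.
sequences = ["MKTA", "MKTG", "MRTA", "MRTG"]
labels = np.array([1.0, 0.8, 0.4, 0.2])

# Because the embedding follows the transformer protocol, it composes with
# any scikit-learn estimator inside a Pipeline.
pipeline = Pipeline([
    ("embed", OneHotSequenceEmbedding()),
    ("model", Ridge(alpha=1.0)),
])
pipeline.fit(sequences, labels)
predictions = pipeline.predict(["MKTA", "MRTG"])
```

Because the embedding step implements `fit` and `transform`, it can be swapped for another sequence representation without changing the downstream model, and the whole pipeline can be passed to standard scikit-learn utilities such as cross-validation or hyperparameter search.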
