aMLProt: an automated machine learning library for protein applications.

Impact factor: 5.4
Ruite Xiang, Christian Domínguez-Dalmases, Albert Cañellas-Solé, Victor Guallar
Bioinformatics (Oxford, England). Published 2025-09-24. DOI: https://doi.org/10.1093/bioinformatics/btaf543

Abstract


Motivation: Machine learning tools have become increasingly common in biological research, driven by the emergence of pre-trained large language models. However, training effective models remains a complex task, since many choices influence their performance. AutoML (automated machine learning) approaches help address these challenges by streamlining the entire model development pipeline.

Results: We developed aMLProt, an AutoML framework tailored specifically for protein applications, such as enzyme engineering and bioprospecting. It features a modular design, allowing each component to be used independently or in combination. Notably, aMLProt integrates 19 classifiers and 26 regressors, along with pre-trained protein language models. It also includes standalone applications proven useful for protein-related workflows. To enhance usability, aMLProt is integrated with Horus, a GUI-based application with a visual interface.
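The model-selection idea behind AutoML can be sketched in a few lines. The example below is not aMLProt's API or its method; it is a generic, standard-library-only illustration in which every name (`mean_baseline`, `make_knn`, `cross_val_mse`, `select_best`) is hypothetical. It assumes a regression task, uses random numeric vectors as a stand-in for protein language model embeddings, and picks the candidate with the lowest cross-validated mean squared error.

```python
import random
from statistics import mean


def mean_baseline(train_X, train_y):
    """Candidate 1 (hypothetical): always predict the training mean."""
    mu = mean(train_y)
    return lambda x: mu


def make_knn(k):
    """Candidate family (hypothetical): k-nearest-neighbour regression."""
    def fit(train_X, train_y):
        def predict(x):
            # Rank training rows by squared Euclidean distance to x.
            ranked = sorted(
                (sum((a - b) ** 2 for a, b in zip(x, row)), target)
                for row, target in zip(train_X, train_y)
            )
            return mean(target for _, target in ranked[:k])
        return predict
    return fit


def cross_val_mse(fit, X, y, n_folds=5):
    """Mean squared error averaged over n_folds held-out folds."""
    fold_errors = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(X), n_folds))
        train_X = [row for i, row in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        predict = fit(train_X, train_y)
        fold_errors.append(mean((predict(X[i]) - y[i]) ** 2 for i in test_idx))
    return mean(fold_errors)


def select_best(candidates, X, y):
    """Score every candidate and return the name with the lowest error."""
    scores = {name: cross_val_mse(fit, X, y) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Synthetic data: 60 four-dimensional vectors standing in for embeddings,
# with a noisy linear target (an assumption made purely for illustration).
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(60)]
y = [2.0 * row[0] - row[1] + rng.gauss(0, 0.05) for row in X]

candidates = {
    "mean_baseline": mean_baseline,
    "knn_1": make_knn(1),
    "knn_3": make_knn(3),
}
best_name, scores = select_best(candidates, X, y)
print(best_name, scores)
```

A real AutoML pipeline would also search hyperparameters, feature representations, and preprocessing steps; the sketch only shows the outer loop of scoring interchangeable candidates under one evaluation protocol.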

Availability: aMLProt is available at https://github.com/etiur/aMLProt.git and https://doi.org/10.5281/zenodo.14971157. The aMLProt plugin is available via the official Horus Plugin Repository at https://horus.bsc.es/repo/plugins/amlprot, and Horus itself can be freely downloaded from https://horus.bsc.es. A demo of aMLProt that requires no registration or download is available at horus.bsc.es/amlprot and horus.bsc.es/amlprot-suggest. The results and data from the pH optima regression model are available at https://zenodo.org/records/15394097.

Supplementary information: Supplementary data are available at Bioinformatics online.
