iPro-MP: a BERT-based model to predict multiple prokaryotic promoters

Impact factor 10.1; Zone 1 (Biology); JCR Q1 (Biotechnology & Applied Microbiology)
Wei Su, Yuhe Yang, Yafei Zhao, Shishi Yuan, Xueqin Xie, Yuduo Hao, Hongqi Zhang, Dongxin Ye, Hao Lyu, Hao Lin
Genome Biology; published 2025-10-13; DOI: 10.1186/s13059-025-03819-9 (https://doi.org/10.1186/s13059-025-03819-9)
Citations: 0

Abstract

Promoters are essential cis-regulatory elements in prokaryotes. They govern gene expression by mediating RNA polymerase binding through core motifs and long-range regulatory interactions, and they play a pivotal role in cell metabolism and environmental adaptation. Accurate identification of prokaryotic promoters is therefore vital for understanding their biological functions. However, existing tools for predicting prokaryotic promoters focus mainly on individual model organisms, and their prediction accuracy needs further improvement. To address these gaps, we develop iPro-MP, a transformer-based prokaryotic promoter prediction framework that we systematically evaluate across 23 phylogenetically diverse species, including both model and non-model organisms. iPro-MP uses a multi-head attention mechanism to capture textual information in DNA sequences and effectively learn their hidden patterns. Cross-species prediction demonstrates the necessity of constructing species-specific models. In a series of experiments, iPro-MP performs strongly, with the AUC exceeding 0.9 in 18 of the 23 species. iPro-MP outperforms existing tools, especially for non-model organisms. For the convenience of other researchers, the source code and datasets of iPro-MP are freely available at https://github.com/Jackie-Suv/iPro-MP.
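The headline result above is stated in terms of AUC (area under the ROC curve) per species. As a minimal sketch of what that metric measures, the snippet below computes AUC with the rank-sum (Mann-Whitney U) identity and counts how many species clear a threshold. The function names `roc_auc` and `count_species_above` and the toy scores are assumptions made for illustration; they are not taken from the iPro-MP code base, and the numbers are not data from the paper.

```python
from typing import Mapping, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for promoter, 0 for non-promoter.
    scores: higher means more promoter-like. Tied scores get their average rank.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positives, normalised by the number of pos/neg pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def count_species_above(auc_by_species: Mapping[str, float], threshold: float = 0.9) -> int:
    """Number of species whose AUC strictly exceeds the threshold."""
    return sum(1 for auc in auc_by_species.values() if auc > threshold)


if __name__ == "__main__":
    # Toy, made-up labels and scores for illustration only.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of promoters from non-promoters, which is why a per-species threshold such as 0.9 is a convenient summary across many organisms.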
Source journal: Genome Biology (Biochemistry, Genetics and Molecular Biology: Genetics)
CiteScore: 21.00
Self-citation rate: 3.30%
Articles published: 241
Review time: 2 months
About the journal: Genome Biology publishes research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens. With an impact factor of 12.3 (2022), it is the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters, and it is the highest-ranked open-access journal in this category. Its in-house editors work closely with an editorial board of international experts to keep the journal at the forefront of scientific advances and community standards, and they engage regularly with researchers at conferences and institute visits to stay abreast of developments in the field.