Decoding oxygen preference: Machine learning discovers functional genes in Bacteria.

IF 3.0 | Zone 2 (Biology) | JCR Q2: BIOTECHNOLOGY & APPLIED MICROBIOLOGY
Genomics Pub Date : 2025-09-01 Epub Date: 2025-08-06 DOI:10.1016/j.ygeno.2025.111095
Siqi Wan, Haida Liu, Geyi Zhu, Yuanming Geng, Wenhao Li, Lijuan Chen, Yunhua Zhang, Guomin Han

Abstract

Predicting bacterial oxygen preference and identifying the associated genes are critical tasks in microbiology. This study developed a machine learning model that uses genomic features to predict bacterial oxygen preference and to discover potential functional genes. Trained on a dataset of 1813 bacterial genomes, a Random Forest model achieved 90.62% accuracy in predicting oxygen preference, outperforming prior methods. Feature analysis pinpointed key protein domains and candidate genes. Experimental overexpression of model-identified genes (encoding SOD, SAM radical enzyme, GCV-T, and FDH domains) in Escherichia coli enhanced growth under aerobic conditions, validating their role in oxygen adaptation. Applying the model to rumen metagenomes revealed a predominantly anaerobic community. This work establishes machine learning as an effective strategy for predicting bacterial oxygen preference and identifying functional genes, and it offers a tool for understanding bacterial oxygen adaptation mechanisms, discovering key functional genes, and efficiently exploring uncultured microbial resources.
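
As a rough illustration of the general approach described in the abstract (an ensemble classifier over genomic features, followed by feature analysis), the sketch below trains a toy bagged ensemble of one-split decision trees on synthetic presence/absence profiles. It is not the authors' pipeline. The domain names (`domain_A`, `noise_1`, ...), the data, the labels, and the split-count importance proxy are all assumptions made for illustration; a real Random Forest would grow deeper trees on real annotated genomes.

```python
import random
from collections import Counter

# Hypothetical feature names; none of these come from the paper.
DOMAINS = ["domain_A", "domain_B", "noise_1", "noise_2", "noise_3", "noise_4"]

def make_toy_genomes(n, rng):
    """Synthetic presence/absence profiles with invented labels (not real data)."""
    genomes = []
    for _ in range(n):
        aerobic = rng.random() < 0.5
        profile = {}
        for d in DOMAINS:
            if d == "domain_A":
                p = 0.9 if aerobic else 0.1   # informative by construction
            elif d == "domain_B":
                p = 0.2 if aerobic else 0.8   # informative by construction
            else:
                p = 0.5                       # uninformative noise
            profile[d] = int(rng.random() < p)
        genomes.append((profile, "aerobe" if aerobic else "anaerobe"))
    return genomes

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(sample, features):
    """Choose the feature whose 0/1 split gives the lowest weighted Gini impurity."""
    overall = Counter(y for _, y in sample).most_common(1)[0][0]
    best = None
    for f in features:
        absent = [y for x, y in sample if x[f] == 0]
        present = [y for x, y in sample if x[f] == 1]
        score = (len(absent) * gini(absent) + len(present) * gini(present)) / len(sample)
        if best is None or score < best[0]:
            vote = {
                0: Counter(absent).most_common(1)[0][0] if absent else overall,
                1: Counter(present).most_common(1)[0][0] if present else overall,
            }
            best = (score, f, vote)
    return best[1], best[2]

def fit_forest(train, n_trees, rng):
    """Bagging plus random feature subsets, the two ingredients of a random forest."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(train) for _ in train]   # bootstrap resample
        features = rng.sample(DOMAINS, k=3)           # random feature subset
        forest.append(fit_stump(sample, features))
    return forest

def predict(forest, profile):
    """Majority vote over all stumps."""
    votes = Counter(vote[profile[f]] for f, vote in forest)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
data = make_toy_genomes(400, rng)
train, test = data[:300], data[300:]
forest = fit_forest(train, n_trees=50, rng=rng)

accuracy = sum(predict(forest, x) == y for x, y in test) / len(test)
usage = Counter(f for f, _ in forest)   # crude importance proxy: how often a feature was chosen

print(f"held-out accuracy: {accuracy:.2f}")
print("most frequently chosen features:", usage.most_common(3))
```

In this toy setting the features chosen most often are the ones built to be informative, which mirrors, in a much simplified form, how feature analysis can point to candidate domains for follow-up experiments.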

Source journal
Genomics (Biology: Bioengineering & Applied Microbiology)
CiteScore: 9.60
Self-citation rate: 2.30%
Articles published: 260
Review time: 60 days
About the journal: Genomics is a forum for describing the development of genome-scale technologies and their application to all areas of biological investigation. As a journal that has evolved with the field that carries its name, Genomics focuses on the development and application of cutting-edge methods, addressing fundamental questions with potential interest to a wide audience. Our aim is to publish the highest quality research and to provide authors with rapid, fair and accurate review and publication of manuscripts falling within our scope.