Harnessing machine learning for metagenomic data analysis: trends and applications.

IF 4.6 | Zone 2 (Biology) | Q1 MICROBIOLOGY
mSystems | Pub Date: 2025-10-07 | Pages: e0164224 | DOI: 10.1128/msystems.01642-24
Shradha Sharma, Hari Priya Narahari, Karthik Raman
Citations: 0

Abstract

Metagenomic sequencing has revolutionized our understanding of microbial ecosystems by enabling high-resolution profiling of microbes across diverse environments. However, the resulting data are high-dimensional, sparse, and noisy, posing challenges for downstream data analysis. Machine learning (ML) has provided an arsenal of tools to extract meaningful insights from such large and complex data sets. This review surveys the existing state of ML applications in metagenomic data analysis, from traditional supervised and unsupervised learning to time-series modeling, transfer learning, and newer directions such as causal ML and generative models. We highlight certain key challenges and delve into important issues like model interpretability, emphasizing the importance of explainable AI (XAI). We also compare ML with mechanistic models, commenting on their relative advantages, disadvantages, and prospects for synergy. Finally, we preview future directions, such as the incorporation of multi-omics data, synthetic data generation, and Agentic AI systems, highlighting the increasingly prominent role that AI and ML will play in the future of microbiome science.

Source journal
mSystems (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 10.50
Self-citation rate: 3.10%
Articles published: 308
Review time: 13 weeks
Journal description: mSystems™ will publish preeminent work that stems from applying technologies for high-throughput analyses to achieve insights into the metabolic and regulatory systems at the scale of both the single cell and microbial communities. The scope of mSystems™ encompasses all important biological and biochemical findings drawn from analyses of large data sets, as well as new computational approaches for deriving these insights. mSystems™ will welcome submissions from researchers who focus on the microbiome, genomics, metagenomics, transcriptomics, metabolomics, proteomics, glycomics, bioinformatics, and computational microbiology. mSystems™ will provide streamlined decisions, while carrying on ASM's tradition of rigorous peer review.