Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models

Impact factor 20.5 | Zone 1 (Biology) | JCR Q1 (Microbiology)
George I. Austin, Aya Brown Kav, Shahd ElNaggar, Heekuk Park, Jana Biermann, Anne-Catrin Uhlemann, Itsik Pe’er, Tal Korem
Nature Microbiology, volume 10, issue 4, pages 897-911. Journal Article. Published 2025-03-27.
DOI: 10.1038/s41564-025-01954-4
https://www.nature.com/articles/s41564-025-01954-4

Abstract

Every step in common microbiome profiling protocols has variable efficiency for each microbe, for example, different DNA extraction efficiency for Gram-positive bacteria. These processing biases impede the identification of signals that are biologically interpretable and generalizable across studies. ‘Batch-correction’ methods have been used to address these issues computationally with some success, but they are largely non-interpretable and often require the use of an outcome variable in a manner that risks overfitting. We present DEBIAS-M (domain adaptation with phenotype estimation and batch integration across studies of the microbiome), an interpretable framework for inference and correction of processing bias, which facilitates domain adaptation in microbiome studies. DEBIAS-M learns bias-correction factors for each microbe in each batch that simultaneously minimize batch effects and maximize cross-study associations with phenotypes. Using diverse benchmarks including 16S rRNA and metagenomic sequencing, classification and regression, and a variety of clinical and molecular targets, we demonstrate that using DEBIAS-M improves cross-study prediction accuracy compared with commonly used batch-correction methods. Notably, we show that the inferred bias-correction factors are stable, interpretable and strongly associated with specific experimental protocols. Overall, we show that DEBIAS-M facilitates improved modelling of microbiome data and identification of interpretable signals that generalize across studies. DEBIAS-M corrects technical variability in microbiome data in a manner both interpretable and suitable for machine learning. In extensive benchmarks, DEBIAS-M facilitates robust analyses that generalize across datasets.
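To make the idea of per-microbe, per-batch bias-correction factors concrete, the following is a minimal sketch and not the authors' implementation or the DEBIAS-M package API. It assumes the bias is multiplicative, so that each taxon's abundance is rescaled by a batch-specific factor before renormalization, and it applies factors that are already known. In the method described above the factors are learned from the data so as to reduce batch effects and support phenotype prediction; that learning step is not shown. The function names `apply_bias_correction` and `batch_effect`, the simulated data, and the batch-effect score are all hypothetical.

```python
import numpy as np

def apply_bias_correction(abundances, batches, log_factors):
    """Rescale each taxon by its batch-specific factor, then renormalize rows.

    abundances:  (n_samples, n_taxa) non-negative abundances
    batches:     (n_samples,) integer batch index of each sample
    log_factors: (n_batches, n_taxa) log of the assumed multiplicative factors
    """
    scaled = abundances * np.exp(log_factors[batches])
    return scaled / scaled.sum(axis=1, keepdims=True)

def batch_effect(rel_abund, batches):
    """Toy score: squared spread of the per-batch mean profiles."""
    means = np.stack([rel_abund[batches == b].mean(axis=0)
                      for b in np.unique(batches)])
    return float(((means - means.mean(axis=0)) ** 2).sum())

rng = np.random.default_rng(0)
n_per_batch, n_taxa, n_batches = 20, 5, 3

# Hypothetical design: the same 20 specimens are processed by 3 protocols,
# so any difference between batches is purely technical.
base = rng.dirichlet(np.ones(n_taxa), size=n_per_batch)
true_rel = np.tile(base, (n_batches, 1))
batches = np.repeat(np.arange(n_batches), n_per_batch)

# Each protocol distorts each taxon by its own multiplicative efficiency.
true_log_bias = rng.normal(0.0, 1.0, size=(n_batches, n_taxa))
observed = true_rel * np.exp(true_log_bias[batches])
observed /= observed.sum(axis=1, keepdims=True)

# Zero log-factors leave the data unchanged; the inverse of the true bias
# recovers the original relative abundances.
no_correction = np.zeros((n_batches, n_taxa))
uncorrected = apply_bias_correction(observed, batches, no_correction)
corrected = apply_bias_correction(observed, batches, -true_log_bias)

effect_before = batch_effect(uncorrected, batches)
effect_after = batch_effect(corrected, batches)
recovered = bool(np.allclose(corrected, true_rel))
```

Because relative abundances are renormalized per sample, the factors for a batch are only identifiable up to a common multiplicative constant; a fitting procedure would need some convention or regularization to pin that down, which this sketch leaves out.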


Source journal: Nature Microbiology
Subject area: Immunology and Microbiology (Microbiology)
CiteScore: 44.40
Self-citation rate: 1.10%
Articles published: 226
Journal description: Nature Microbiology aims to cover a comprehensive range of topics related to microorganisms. This includes: Evolution: The journal is interested in exploring the evolutionary aspects of microorganisms. This may include research on their genetic diversity, adaptation, and speciation over time. Physiology and cell biology: Nature Microbiology seeks to understand the functions and characteristics of microorganisms at the cellular and physiological levels. This may involve studying their metabolism, growth patterns, and cellular processes. Interactions: The journal focuses on the interactions microorganisms have with each other, as well as their interactions with hosts or the environment. This encompasses investigations into microbial communities, symbiotic relationships, and microbial responses to different environments. Societal significance: Nature Microbiology recognizes the societal impact of microorganisms and welcomes studies that explore their practical applications. This may include research on microbial diseases, biotechnology, or environmental remediation. In summary, Nature Microbiology is interested in research related to the evolution, physiology and cell biology of microorganisms, their interactions, and their societal relevance.