George I. Austin, Aya Brown Kav, Shahd ElNaggar, Heekuk Park, Jana Biermann, Anne-Catrin Uhlemann, Itsik Pe’er, Tal Korem
Nature Microbiology, volume 10, issue 4, pages 897-911. Published 2025-03-27. DOI: 10.1038/s41564-025-01954-4. https://www.nature.com/articles/s41564-025-01954-4
Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models
Every step in common microbiome profiling protocols has variable efficiency for each microbe, for example, different DNA extraction efficiency for Gram-positive bacteria. These processing biases impede the identification of signals that are biologically interpretable and generalizable across studies. ‘Batch-correction’ methods have been used to address these issues computationally with some success, but they are largely non-interpretable and often require the use of an outcome variable in a manner that risks overfitting. We present DEBIAS-M (domain adaptation with phenotype estimation and batch integration across studies of the microbiome), an interpretable framework for inference and correction of processing bias, which facilitates domain adaptation in microbiome studies. DEBIAS-M learns bias-correction factors for each microbe in each batch that simultaneously minimize batch effects and maximize cross-study associations with phenotypes. Using diverse benchmarks including 16S rRNA and metagenomic sequencing, classification and regression, and a variety of clinical and molecular targets, we demonstrate that using DEBIAS-M improves cross-study prediction accuracy compared with commonly used batch-correction methods. Notably, we show that the inferred bias-correction factors are stable, interpretable and strongly associated with specific experimental protocols. Overall, we show that DEBIAS-M facilitates improved modelling of microbiome data and identification of interpretable signals that generalize across studies. DEBIAS-M corrects technical variability in microbiome data in a manner both interpretable and suitable for machine learning. In extensive benchmarks, DEBIAS-M facilitates robust analyses that generalize across datasets.
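The idea of a per-microbe, per-batch correction factor can be illustrated with a small NumPy sketch. This is not the DEBIAS-M implementation or its API. It assumes a multiplicative bias model (each batch observes each taxon with its own efficiency, followed by renormalization to relative abundance), which the abstract does not spell out, and it takes the correction factors as given rather than learning them. The function name `apply_bias_correction`, the array `log2_factors` and the toy numbers are all hypothetical.

```python
import numpy as np


def apply_bias_correction(abundances, batches, log2_factors):
    """Rescale each sample by its batch's per-taxon factors, then renormalize.

    abundances:   (n_samples, n_taxa) non-negative abundances
    batches:      (n_samples,) integer batch index of each sample
    log2_factors: (n_batches, n_taxa) hypothetical log2 correction factors
    """
    abundances = np.asarray(abundances, dtype=float)
    batches = np.asarray(batches, dtype=int)
    log2_factors = np.asarray(log2_factors, dtype=float)

    # Multiply every taxon by the factor of the batch the sample belongs to.
    scaled = abundances * np.exp2(log2_factors[batches])

    # Renormalize to relative abundances; samples with zero total stay zero.
    totals = scaled.sum(axis=1, keepdims=True)
    return np.divide(scaled, totals, out=np.zeros_like(scaled), where=totals > 0)


# Toy example (assumed numbers): one true community measured in two batches.
true_profile = np.array([0.5, 0.3, 0.2])
efficiency = np.array([
    [1.0, 1.0, 1.0],  # batch 0: unbiased protocol
    [1.0, 0.5, 2.0],  # batch 1: taxon 1 under-extracted, taxon 2 over-amplified
])

observed = true_profile * efficiency
observed = observed / observed.sum(axis=1, keepdims=True)

# With factors equal to the inverse efficiencies, both batches agree again.
corrected = apply_bias_correction(observed, [0, 1], -np.log2(efficiency))
```

In this toy setting the factors are known by construction. The abstract describes DEBIAS-M as learning such factors from data so that batch effects shrink and cross-study phenotype associations improve; that fitting step is not shown here.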
About the journal:
Nature Microbiology covers a broad range of topics related to microorganisms, including:
Evolution: The journal is interested in exploring the evolutionary aspects of microorganisms. This may include research on their genetic diversity, adaptation, and speciation over time.
Physiology and cell biology: Nature Microbiology seeks to understand the functions and characteristics of microorganisms at the cellular and physiological levels. This may involve studying their metabolism, growth patterns, and cellular processes.
Interactions: The journal focuses on the interactions microorganisms have with each other, as well as their interactions with hosts or the environment. This encompasses investigations into microbial communities, symbiotic relationships, and microbial responses to different environments.
Societal significance: Nature Microbiology recognizes the societal impact of microorganisms and welcomes studies that explore their practical applications. This may include research on microbial diseases, biotechnology, or environmental remediation.
In summary, Nature Microbiology is interested in research related to the evolution, physiology and cell biology of microorganisms, their interactions, and their societal relevance.