Accurate prediction of absolute prokaryotic abundance from DNA concentration

Jakob Wirbel, Tessa M Andermann, Erin F Brooks, Lanya Evans, Adam Groth, Mai Dvorak, Meenakshi Chakraborty, Bianca Palushaj, Gabriella Z M Reynolds, Imani E Porter, Monzr Al Malki, Andrew Rezvani, Mahasweta Gooptu, Hany Elmariah, Lyndsey Runaas, Teng Fei, Michael J Martens, Javier Bolaños-Meade, Mehdi Hamadani, Shernan Holtan, Rob Jenq, Jonathan U Peled, Mary M Horowitz, Kathleen L Poston, Wael Saber, Leslie S Kean, Miguel-Angel Perales, Ami S Bhatt

Cell Reports Methods, article 101030. Published 2025-05-19 (Epub 2025-04-28). DOI: https://doi.org/10.1016/j.crmeth.2025.101030
Abstract
Quantification of the absolute microbial abundance in a human stool sample is crucial for a comprehensive understanding of the microbial ecosystem, but this information is lost upon metagenomic sequencing. While several methods exist to measure absolute microbial abundance, they are technically challenging and costly, presenting an opportunity for machine learning. Here, we observe a strong correlation between DNA concentration and the absolute number of 16S ribosomal RNA copies as measured by digital droplet PCR in clinical stool samples from individuals undergoing hematopoietic cell transplantation (BMT CTN 1801). Based on this correlation and additional measurements, we trained an accurate yet simple machine learning model for the prediction of absolute prokaryotic load, which showed exceptional prediction accuracy on an external cohort that includes people living with Parkinson's disease and healthy controls. We propose that, with further validation, this model has the potential to enable accurate absolute abundance estimation based on readily available sample measurements.
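To make the idea of predicting 16S copy number from DNA concentration concrete, the following is a minimal sketch and not the model described above. It assumes a plain least-squares fit on log10-transformed values, uses DNA concentration as the only predictor (the abstract mentions additional measurements that are not specified here), and runs on synthetic data generated inside the script. The function names, the units (ng/µL, copies per gram), and the slope, intercept, and noise values are all hypothetical choices made for illustration.

```python
import math
import random


def make_synthetic(n, seed=0):
    """Generate SYNTHETIC (made-up) sample pairs for illustration only.

    Returns DNA concentrations (hypothetical unit: ng/uL) and 16S copy
    numbers (hypothetical unit: copies per gram). The assumed relation
    log10(copies) = 1.0 * log10(dna) + 8.0 + noise is not from the paper.
    """
    rng = random.Random(seed)
    dna = [10 ** rng.uniform(0.0, 2.0) for _ in range(n)]
    copies = [10 ** (math.log10(d) + 8.0 + rng.gauss(0.0, 0.2)) for d in dna]
    return dna, copies


def fit_log_log(dna, copies):
    """Ordinary least squares for log10(copies) = slope * log10(dna) + intercept."""
    xs = [math.log10(v) for v in dna]
    ys = [math.log10(v) for v in copies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_copies(dna_value, slope, intercept):
    """Back-transform the log-scale prediction to a copy number."""
    return 10 ** (slope * math.log10(dna_value) + intercept)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    dna, copies = make_synthetic(50, seed=0)

    # Fit on the first 40 synthetic samples, evaluate on the remaining 10.
    slope, intercept = fit_log_log(dna[:40], copies[:40])
    predicted = [predict_copies(d, slope, intercept) for d in dna[40:]]

    r = pearson_r(
        [math.log10(p) for p in predicted],
        [math.log10(c) for c in copies[40:]],
    )
    print(f"slope={slope:.3f} intercept={intercept:.3f} held-out r={r:.3f}")
```

Fitting on the log scale is a design choice of this sketch: microbial loads typically span several orders of magnitude, so an untransformed fit would be dominated by the highest-load samples. The held-out split here stands in only loosely for the external-cohort evaluation described in the abstract.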