B-229 Interlaboratory study using spike-ins and control samples to assess sample extraction and sequencing biases for metagenomics workflows

IF 6.3 · Zone 2 (Medicine) · JCR Q1, Medical Laboratory Technology

Clinical Chemistry · Pub Date: 2025-10-02 · DOI: 10.1093/clinchem/hvaf086.617

Jason Kralj, Stephanie Servetas, Samuel Forry, Monique Hunter, Scott Jackson

Citation count: 0

Abstract

Background Sequencing bias remains a major obstacle to comparing metagenomic sample analyses and to reusing data. Methodological variables affect the data, making it difficult to interpret results from otherwise similar studies. However, controls in the form of spike-ins and common samples provide a mechanism for comparing workflows and characterizing their biases. These controls allow methodological discrepancies to be resolved ahead of interlaboratory studies, and/or provide data that could reconcile differences between individual workflows. We initiated a small two-part interlaboratory study (ILS) to examine two questions: (a) does DNA extraction (5 methods) affect apparent sample composition; and (b) do DNA library/sequencing protocols (3 methods) have biases?

Methods For ILS-a, participants were given 8 total samples (4x sample #1, 1x samples #2-#5) consisting of human stool spiked with a mixture of whole cells (200k/uL total of S. aureus, S. enterica, E. coli, L. monocytogenes, P. aeruginosa) and DNA internal controls (34k genomes/uL each of A. hydrophila and L. pneumophila). Participants extracted the DNA and returned the samples to NIST for sequencing. For ILS-b, participants were given 8 total samples consisting of DNA extracted from the same human stool samples as in (a), spiked with DNA internal controls at ~50k copies/uL/strain and spike-ins at ~75k copies/uL/strain (see above). Labs generated DNA libraries, sequenced the samples, and returned the fastq files to NIST for processing. For both (a) and (b), kraken2 was used to taxonomically classify the reads, reporting at the genus level. Relative abundance (reads / total reads) and normalized abundance (reads / internal control reads) were used to examine the spike-ins and native taxa across the 5 samples.

Results ILS-a (extraction) showed significant extraction bias, ranging from no change to 5-fold, with the spike-ins and native taxa showing similar trends in Gram +/- behavior. ILS-b (DNA) also showed significant bias as a function of genome GC content across the different DNA library preparations (see Figure). These biases were reproducible between labs. Within-lab reproducibility of the 4 sample #1 replicates was 10-16% (a) and 9-18% (b), and the spike-in controls' normalized abundances were consistent within each lab across the 5 samples. This showed that the biases were independent of sample composition, and that they were both reproducible and systematic.

Conclusion Spike-ins and common-sample controls elucidate biases (and harmonization) between workflows, and indicate where data are likely to have comparability challenges. The biases observed with the spike-ins were similar to those of the native taxa, such that a small number of well-characterized organisms helped account for biases across many native taxa. Hence, even small numbers of spike-ins provide a useful tool for assessing method bias, and indicate when more thorough method characterization may improve data intercomparability.
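
The two abundance measures defined in the Methods can be illustrated with a minimal Python sketch. It is not the study's pipeline. The read counts are made up, the function names are hypothetical, and two choices are assumptions rather than statements from the abstract: that "internal control reads" means the summed reads of both control genera (Aeromonas and Legionella, the genera of A. hydrophila and L. pneumophila), and that within-lab reproducibility is expressed as a percent coefficient of variation.

```python
from statistics import mean, stdev

# Genera of the DNA internal controls named in the abstract
# (A. hydrophila, L. pneumophila). Summing both is an assumption.
INTERNAL_CONTROLS = ("Aeromonas", "Legionella")


def relative_abundance(counts):
    """Relative abundance: reads / total reads, per genus."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no classified reads")
    return {genus: n / total for genus, n in counts.items()}


def normalized_abundance(counts, controls=INTERNAL_CONTROLS):
    """Normalized abundance: reads / internal control reads, per non-control genus."""
    control_reads = sum(counts.get(genus, 0) for genus in controls)
    if control_reads == 0:
        raise ValueError("no internal control reads found")
    return {
        genus: n / control_reads
        for genus, n in counts.items()
        if genus not in controls
    }


def percent_cv(values):
    """Percent coefficient of variation across replicates (assumed metric)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical genus-level read counts for one sample (illustrative only).
sample = {
    "Staphylococcus": 1200,
    "Escherichia": 3000,
    "Bacteroides": 5000,
    "Aeromonas": 400,
    "Legionella": 400,
}

rel = relative_abundance(sample)
norm = normalized_abundance(sample)
```

Because the normalized value divides by control reads rather than by total reads, it does not shift when unrelated taxa in the sample become more or less abundant, which is what makes it useful for comparing spike-ins across the 5 samples.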

Source journal

Clinical Chemistry (Medicine: Medical Laboratory Technology)

CiteScore: 11.30
Self-citation rate: 4.30%
Articles published: 212
Review time: 1.7 months

Journal description: Clinical Chemistry is a peer-reviewed scientific journal that is the premier publication for the science and practice of clinical laboratory medicine. It was established in 1955 and is associated with the Association for Diagnostics & Laboratory Medicine (ADLM). The journal focuses on laboratory diagnosis and management of patients, and has expanded to include other clinical laboratory disciplines such as genomics, hematology, microbiology, and toxicology. It also publishes articles relevant to clinical specialties including cardiology, endocrinology, gastroenterology, genetics, immunology, infectious diseases, maternal-fetal medicine, neurology, nutrition, oncology, and pediatrics. In addition to original research, editorials, and reviews, Clinical Chemistry features recurring sections such as clinical case studies, perspectives, podcasts, and Q&A articles. It has the highest impact factor among journals of clinical chemistry, laboratory medicine, pathology, analytical chemistry, transfusion medicine, and clinical microbiology. The journal is indexed in databases such as MEDLINE and Web of Science.