Jannatul Ferdous, Samuel Kunkleman, William Taylor, April Harris, Cynthia J Gibas, Jessica A Schlueter
Science of the Total Environment, article 174515. Published 2024-10-20 (Epub 2024-07-05). Impact factor 8.0; JCR Q1 (Environmental Sciences). DOI: 10.1016/j.scitotenv.2024.174515 (https://doi.org/10.1016/j.scitotenv.2024.174515)
A gold standard dataset and evaluation of methods for lineage abundance estimation from wastewater.
During the SARS-CoV-2 pandemic, genome-based wastewater surveillance sequencing has been a powerful tool for public health to monitor circulating and emerging viral variants. As a medium, wastewater is very complex because of its mixed matrix nature, which makes the deconvolution of wastewater samples more difficult. Here we introduce a gold standard dataset constructed from synthetic viral control mixtures of known composition, spiked into a wastewater RNA matrix and sequenced on the Oxford Nanopore Technologies platform. We compare the performance of eight of the most commonly used deconvolution tools in identifying SARS-CoV-2 variants present in these mixtures. The software evaluated was primarily chosen for its relevance to the CDC wastewater surveillance reporting protocol, which until recently employed a pipeline that incorporates results from four deconvolution methods: Freyja, kallisto, Kraken 2/Bracken, and LCS. We also tested Lollipop, a deconvolution method used by the Swiss SARS-CoV-2 Sequencing Consortium, and three additional methods not used in the C-WAP pipeline: lineagespot, Alcov, and VaQuERo. We found that the commonly used software Freyja outperformed the other CDC pipeline tools in correct identification of lineages present in the control mixtures, and that the VaQuERo method was similarly accurate, with minor differences in the ability of the two methods to avoid false negatives and suppress false positives. Our results also provide insight into the effect of the tiling primer scheme and wastewater RNA extract matrix on viral sequencing and data deconvolution outcomes.
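The abstract frames tool performance in terms of avoiding false negatives and suppressing false positives against mixtures of known composition. The following is a minimal illustrative sketch of how such a comparison could be scored; it is not the authors' evaluation code. The function name `score_calls`, the 1% detection threshold, and the lineage names and abundances in the example are all hypothetical assumptions.

```python
def score_calls(truth, estimate, threshold=0.01):
    """Compare estimated lineage abundances with a known control mixture.

    truth and estimate map lineage name -> fractional abundance.
    threshold is an assumed cutoff below which an estimated lineage
    is treated as not detected (hypothetical, not from the paper).
    """
    expected = {lin for lin, frac in truth.items() if frac > 0}
    called = {lin for lin, frac in estimate.items() if frac >= threshold}

    true_pos = expected & called
    false_neg = expected - called   # present in the mixture, but missed
    false_pos = called - expected   # reported, but not in the mixture

    precision = len(true_pos) / len(called) if called else 0.0
    recall = len(true_pos) / len(expected) if expected else 0.0
    return {
        "false_negatives": sorted(false_neg),
        "false_positives": sorted(false_pos),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical control mixture and hypothetical tool output.
truth = {"BA.1": 0.5, "BA.2": 0.3, "B.1.617.2": 0.2}
estimate = {"BA.1": 0.55, "BA.2": 0.40, "XBB.1.5": 0.05, "B.1.617.2": 0.004}

result = score_calls(truth, estimate)
print(result)
```

In this made-up example one expected lineage falls below the threshold (a false negative) and one unexpected lineage is reported (a false positive), so both precision and recall come out to 2/3. The result depends on the threshold, so any real comparison would need to state which cutoff was used.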
About the journal:
The Science of the Total Environment is an international journal dedicated to scientific research on the environment and its interaction with humanity. It covers a wide range of disciplines and seeks to publish innovative, hypothesis-driven, and impactful research that explores the entire environment, including the atmosphere, lithosphere, hydrosphere, biosphere, and anthroposphere.
The journal's updated Aims & Scope emphasizes the importance of interdisciplinary environmental research with broad impact. Priority is given to studies that advance fundamental understanding and explore the interconnectedness of multiple environmental spheres. Field studies are preferred, while laboratory experiments must demonstrate significant methodological advancements or mechanistic insights with direct relevance to the environment.