SARS-CoV-2 sequencing artifacts associated with targeted PCR enrichment and read mapping

Impact factor 2.6; Region 3 (multidisciplinary journals); JCR Q1 (MULTIDISCIPLINARY SCIENCES)
PLoS ONE Pub Date : 2025-10-16 eCollection Date: 2025-01-01 DOI:10.1371/journal.pone.0334009
Kirsten Maren Ellegaard, Vithiagaran Gunalan, Raphael Sieber, Sharmin Jamshid Baig, Nicolai Balle Larsen, Marc Bennedbæk, Jonas Bybjerg-Grauholm, Leandro Andrés Escobar-Herrera, Tobias Gress, Theis Hass Thorsen, Anders Krusager, Gitte Nygaard Aasbjerg, Nour Saad Al-Tamimi, Casper Westergaard, Christina Wiid Svarrer, Morten Rasmussen, Marc Stegger
PLoS ONE 20(10): e0334009. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12530606/pdf/
Citations: 0

Abstract

Protocols and pipelines for SARS-CoV-2 genome sequencing were rapidly established when the COVID-19 outbreak was declared a pandemic. The most widely used approach for sequencing SARS-CoV-2 includes targeted enrichment by PCR, followed by shotgun sequencing and reference-based genome assembly. As the continued surveillance of SARS-CoV-2 worldwide is transitioning towards a lower level of intensity, it is timely to revisit the sequencing protocols and pipelines established during the acute phase of the pandemic. In the current study, we investigated the impact of primer scheme and reference genome choice by sequencing samples with multiple primer schemes (ARTIC V3, V4.1 and V5.3.2) and re-processing reads with multiple reference genomes. We also analysed the temporal development in ambiguous base calls during the emergence of the BA.2.86.x variant. We found that the primers used for targeted enrichment can result in recurrent ambiguous base calls, which can accumulate rapidly in response to the emergence of a new variant. We also found examples of consistent base calling errors associated with PCR artifacts and amplicon drop-out. Similarly, misalignments and partially mapped reads on the reference genome resulted in ambiguous base calls, as well as in defining mutations being omitted from the assembly. These findings highlight some key limitations of using targeted enrichment by PCR and reference-based genome assembly for sequencing SARS-CoV-2, and the importance of continuously monitoring and updating primer schemes and bioinformatic pipelines.
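
The abstract describes ambiguous base calls that recur at the same genome positions and are linked to the primers used for enrichment. As a rough illustration of how such positions might be screened for, the Python sketch below counts IUPAC ambiguity codes per position across a set of consensus genomes and reports which flagged positions fall inside primer binding intervals. It is not the authors' pipeline: the function names, the 50% threshold, the toy sequences and the primer coordinates are all assumptions made for this example, and it assumes every consensus genome shares the reference coordinate system (equal length, no insertions applied).

```python
from collections import Counter

# IUPAC ambiguity codes. N is deliberately excluded, because it usually
# marks masked or uncovered positions rather than a mixed base call.
AMBIGUOUS = set("RYSWKMBDHV")


def recurrent_ambiguous_positions(consensus_seqs, min_fraction=0.5):
    """Return {1-based position: fraction of genomes ambiguous there}.

    Only positions whose fraction is >= min_fraction are reported.
    Assumes all sequences have the same length (reference coordinates).
    """
    if not consensus_seqs:
        return {}
    length = len(consensus_seqs[0])
    if any(len(seq) != length for seq in consensus_seqs):
        raise ValueError("all consensus sequences must have equal length")

    counts = Counter()
    for seq in consensus_seqs:
        for index, base in enumerate(seq.upper()):
            if base in AMBIGUOUS:
                counts[index + 1] += 1

    total = len(consensus_seqs)
    return {
        pos: count / total
        for pos, count in sorted(counts.items())
        if count / total >= min_fraction
    }


def positions_in_primers(positions, primer_sites):
    """Map each position to the primer intervals (1-based, inclusive) containing it.

    primer_sites is a hypothetical {name: (start, end)} dictionary; real
    coordinates would come from the primer scheme's own definition file.
    """
    return {
        pos: [name for name, (start, end) in primer_sites.items()
              if start <= pos <= end]
        for pos in positions
    }


# Toy data: four short "consensus genomes" and two made-up primer sites.
genomes = ["ACGTRCGT", "ACGTRCGT", "ACGTACGY", "ACNTRCGT"]
primers = {"toy_primer_LEFT": (4, 6), "toy_primer_RIGHT": (8, 8)}

hits = recurrent_ambiguous_positions(genomes, min_fraction=0.5)
overlap = positions_in_primers(hits, primers)
print(hits)     # positions that are ambiguous in at least half of the genomes
print(overlap)  # primer intervals overlapping each flagged position
```

In the toy data, position 5 is ambiguous in three of four genomes and lies inside the made-up left primer, while the single ambiguous call at position 8 falls below the threshold and is not reported. Tracking the same fractions over sampling dates would be one way to watch ambiguity accumulate as a new variant emerges.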

Source journal: PLoS ONE (Biology)
CiteScore: 6.20
Self-citation rate: 5.40%
Articles published: 14242
Review time: 3.7 months
About the journal: PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage