Optimized SMRT-UMI protocol produces highly accurate sequence datasets from diverse populations: Application to HIV-1 quasispecies

IF 5.5, Zone 2 (Medicine), Q1 VIROLOGY

Virus Evolution. Pub Date: 2024-04-04. DOI: 10.1093/ve/veae019

Dylan H Westfall, Wenjie Deng, Alec Pankow, Lennie Chen, Hong Zhao, Carolyn Williamson, Morgane Rolland, Ben Murrell, James I Mullins

Citations: 0

Abstract

Pathogen diversity resulting in quasispecies can enable persistence and adaptation to host defenses and therapies. However, accurate quasispecies characterization can be impeded by errors introduced during sample handling and sequencing, which can require extensive optimization to overcome. We present complete laboratory and bioinformatics workflows to overcome many of these hurdles. The Pacific Biosciences single-molecule real-time (SMRT) platform was used to sequence polymerase chain reaction (PCR) amplicons derived from cDNA templates tagged with unique molecular identifiers (UMIs), an approach termed SMRT-UMI. Optimized laboratory protocols were developed through extensive testing of different sample preparation conditions to minimize between-template recombination during PCR. The use of UMIs allowed accurate template quantitation as well as removal of point mutations introduced during PCR and sequencing, producing a highly accurate consensus sequence from each template. Production of highly accurate sequences from the large datasets generated by SMRT-UMI sequencing is facilitated by a novel bioinformatic pipeline, Probabilistic Offspring Resolver for Primer IDs (PORPIDpipeline). PORPIDpipeline automatically filters and parses circular consensus reads by sample, identifies and discards reads with UMIs likely created by PCR and sequencing errors, generates consensus sequences, checks for contamination within the dataset, and removes any sequence with evidence of PCR recombination, heteroduplex formation, or early-cycle PCR errors. The optimized SMRT-UMI sequencing and PORPIDpipeline methods presented here represent a highly adaptable and established starting point for accurate sequencing of diverse pathogens. These methods are illustrated through characterization of human immunodeficiency virus quasispecies in a virus transmitter-recipient pair of individuals.
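
The abstract's central idea, that reads sharing a UMI descend from one template and can be collapsed into a consensus that out-votes PCR and sequencing point errors, can be illustrated with a toy sketch. This is not PORPIDpipeline's algorithm (the abstract describes it as probabilistic, and its internals are not given here). It is a plain per-position majority vote, and the function name `umi_consensus`, the `min_family_size` threshold, and the example reads are all hypothetical. It also assumes the reads in each UMI family are already aligned and of equal length.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3):
    """Collapse (umi, sequence) pairs into one majority-vote consensus per UMI.

    Hypothetical helper for illustration only. Assumes every sequence in a
    UMI family is already aligned and has the same length.
    """
    # Group reads into families by their UMI tag.
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        # Small families cannot reliably out-vote an error, so skip them.
        if len(seqs) < min_family_size:
            continue
        # Take the most common base at each aligned position.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Made-up example data: two well-supported templates and one singleton.
reads = [
    ("AAGT", "ACGTACGT"),
    ("AAGT", "ACGTACGT"),
    ("AAGT", "ACGAACGT"),  # one read carries a point error at position 4
    ("CCTA", "ACGTTCGT"),
    ("CCTA", "ACGTTCGT"),
    ("CCTA", "ACGTTCGT"),
    ("GGGG", "ACGTACGT"),  # singleton family, dropped by the size threshold
]
result = umi_consensus(reads)
```

In this sketch the number of keys in `result` serves as a rough template count, mirroring the template quantitation role the abstract attributes to UMIs. A real workflow would additionally need to handle erroneous UMIs, recombinants, and contamination, which the abstract lists as separate pipeline steps.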

Source journal

Virus Evolution (Immunology and Microbiology: Microbiology)

CiteScore: 10.50
Self-citation rate: 5.70%
Articles published: 108
Review time: 14 weeks

Journal description: Virus Evolution is a new Open Access journal focusing on the long-term evolution of viruses, viruses as a model system for studying evolutionary processes, viral molecular epidemiology, and environmental virology. The aim of the journal is to provide a forum for original research papers, reviews, and commentaries, and a venue for in-depth discussion of topics relevant to virus evolution.