Full-length PacBio Amplicon Sequencing to Unveil RNA Editing Sites

Impact factor: 4.6; JCR quartile: Q2 (Materials Science, Biomaterials)
Xiao-lu Zhu, Ming-ling Liao, Ya-Jie Zhu, Yun‐wei Dong
DOI: https://doi.org/10.2174/1574893618666230803112142
Journal: ACS Applied Bio Materials
Publication date: 2023-08-03
Publication type: Journal Article
Citations: 0

Abstract

RNA editing diversifies transcripts by introducing post-transcriptional sequence changes. RNA editing sites are currently detected mostly by Sanger sequencing or second-generation sequencing. Sanger-based detection, however, is limited by confounding background peaks when amplicons are sequenced directly and by the number of clones when clones are sequenced, while second-generation sequencing is constrained by its short read length.

We aimed to design a pipeline that accurately detects RNA editing sites in full-length, long-read amplicons, for studies that focus on a few specific genes of interest.

We developed a novel high-throughput pipeline for detecting RNA editing sites based on PacBio circular consensus sequencing (CCS), which combines high accuracy, high throughput, and long-read coverage. We tested the pipeline on cytosolic malate dehydrogenase in the hard-shelled mussel Mytilus coruscus and validated the results with direct Sanger sequencing.

PacBio CCS amplicon reads from three mussels were first filtered by quality and then selected by open reading frame. After filtering, between 225 and 2047 sequences from each of the three mussels were used to identify RNA editing sites. By comparison with the corresponding genomic DNA sequences, we extracted 227-799 candidate RNA editing sites, excluding heterozygous sites. Applying a new error model designed specifically for RNA editing site detection, we narrowed these down to 7-11 final RNA editing sites (RESs). All of the resulting sites agree with the Sanger sequencing validation.

We report a method with a near-zero error rate for identifying RNA editing sites in long-read amplicons using PacBio CCS sequencing.
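The filtering and candidate-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The quality threshold (`min_mean_q=30`), the ORF test (starts with ATG, length divisible by 3, a single terminal stop codon), the assumption that reads are already aligned to the genomic DNA sequence without indels, and all function names are hypothetical choices made for illustration. The error model that reduces candidates to final RESs is not described in the abstract and is not reproduced here.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def mean_quality(quals):
    """Arithmetic mean of per-base Phred scores (a simplification)."""
    return sum(quals) / len(quals)


def has_intact_orf(seq):
    """True if seq starts with ATG, is a multiple of 3 long,
    and its only stop codon is the last codon."""
    if len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return codons[-1] in STOP_CODONS and not any(
        c in STOP_CODONS for c in codons[:-1]
    )


def candidate_sites(reads, gdna, het_positions, min_mean_q=30):
    """Count cDNA/gDNA mismatches per position over reads that pass filtering.

    reads: list of (sequence, phred_scores) pairs, assumed to be aligned
           to gdna base-for-base (same length, no indels).
    gdna: genomic DNA reference sequence for the same individual.
    het_positions: 0-based positions known to be heterozygous in the gDNA;
                   mismatches there are skipped.
    Returns (number of reads kept, Counter keyed by (pos, ref_base, read_base)).
    """
    kept = [
        seq for seq, quals in reads
        if len(seq) == len(gdna)
        and mean_quality(quals) >= min_mean_q
        and has_intact_orf(seq)
    ]
    counts = Counter()
    for seq in kept:
        for pos, (ref, base) in enumerate(zip(gdna, seq)):
            if base != ref and pos not in het_positions:
                counts[(pos, ref, base)] += 1
    return len(kept), counts
```

Every mismatch counted this way is only a candidate: some will be residual sequencing or amplification errors, which is why a further statistical step such as the error model mentioned in the abstract is needed before a position is called an RNA editing site.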