Pushing the limits of single molecule transcript sequencing to uncover the largest disease-associated transcript isoforms in the human neural retina

bioRxiv - Genetics. Pub Date: 2024-09-14. DOI: 10.1101/2024.09.10.612265

Merel Stemerdink, Tabea Riepe, Nick Zomer, Renee Salz, Michael Kwint, Raoul Timmermans, Barbara Ferrari, Stefano Ferrari, Alfredo Duenas Rey, Emma Delanote, Suzanne E de Bruijn, Hannie Kremer, Susanne Roosing, Frauke Coppieters, Alexander Hoischen, Frans P.M. Cremers, Peter A.C. 't Hoen, Erwin van Wijk, Erik de Vrieze

Abstract

Sequencing technologies have long limited the comprehensive investigation of large transcripts associated with inherited retinal diseases (IRDs) such as Usher syndrome, which involves 11 associated genes with transcripts up to 19.6 kb. To address this, we used PacBio long-read mRNA isoform sequencing (Iso-Seq) following standard library preparation and an optimized workflow to enrich for long transcripts in the human neural retina. While our workflow achieved sequencing of transcripts up to 15 kb, this was insufficient for the Usher syndrome-associated genes USH2A and ADGRV1, with transcripts of 18.9 kb and 19.6 kb, respectively. To overcome this, we employed the Samplix Xdrop System for indirect target enrichment of cDNA, a technique typically used for genomic DNA capture. This method facilitated the successful capture and sequencing of ADGRV1 transcripts as well as the full-length 18.9 kb USH2A transcripts. By combining algorithmic analysis with detailed manual curation of sequenced reads, we identified novel isoforms and alternative splicing events across the 11 Usher syndrome-associated genes, with implications for diagnostics and therapy development. Our findings demonstrate the Xdrop System's adaptability for cDNA capture and the advantages of integrating computational and manual transcript analyses. The full neural retina sequencing dataset is available via EGA under identifier EGAD50000000720.
