VirDiG: a de novo transcriptome assembler for coronavirus
Minghao Li, Xuaoyu Guo, Jin Zhao
Bioinformatics Advances, 5(1), vbaf075. Published 2025-04-08. DOI: https://doi.org/10.1093/bioadv/vbaf075
Citations: 0
Abstract
Motivation: The discontinuous transcription mechanism of coronaviruses contributes to their adaptation to different host environments and plays a critical role in their lifecycle. Accurate assembly of coronavirus transcripts is vital for understanding the virus's biological traits and developing precise prevention and treatment strategies. However, existing de novo assembly algorithms are primarily designed for alternative splicing events in eukaryotes and are not suitable for assembling the coronavirus transcriptome, which consists of both genomic RNA and subgenomic mRNAs. Reconstructing the coronavirus transcriptome from short reads therefore remains a challenging problem.
Results: In this work, we present VirDiG, a de novo transcriptome assembler specifically designed for coronaviruses. VirDiG uses a discontinuous graph to facilitate accurate transcript assembly by incorporating information from paired-end reads, sequencing depth, and start and stop codons. Experimental results from both simulated and real datasets show that VirDiG exhibits significant advantages in reconstructing the transcriptome of coronaviruses when compared to traditional de novo assemblers tailored for classical eukaryotic transcriptome assembly.
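To make the idea of a discontinuous graph concrete, the following is a minimal toy sketch in Python. It is not VirDiG's implementation, and the abstract does not specify the algorithm's details. Everything here is an assumption made for illustration: the names `Junction`, `build_graph`, `has_orf`, and `assemble`; the use of a single read-support count per junction as a stand-in for paired-end and depth evidence; the `min_support` threshold; and the limit of one jump per transcript. The sketch splits a genome at junction breakpoints, links segments with continuous edges and with discontinuous (jump) edges, and keeps only paths that are sufficiently supported and contain a start codon followed in-frame by a stop codon.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class Junction:
    """Hypothetical record of one discontinuous jump (assumes 0 < donor < acceptor < genome length)."""
    donor: int     # genome position (exclusive) where the transcript leaves the genome
    acceptor: int  # genome position where the transcript resumes
    support: int   # reads or read pairs spanning the jump (stand-in for depth evidence)


def has_orf(seq: str, min_codons: int = 2) -> bool:
    """True if seq has an ATG followed in-frame by a stop codon, with at least min_codons before the stop."""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    return True
                break  # first in-frame stop ends this reading frame
    return False


def build_graph(genome_len: int, junctions: list[Junction]):
    """Cut the genome at every breakpoint; return segments and forward edges."""
    cuts = sorted({0, genome_len,
                   *(j.donor for j in junctions),
                   *(j.acceptor for j in junctions)})
    segments = list(zip(cuts, cuts[1:]))
    start_at = {s: i for i, (s, _) in enumerate(segments)}
    end_at = {e: i for i, (_, e) in enumerate(segments)}
    edges = {i: [] for i in range(len(segments))}
    for i in range(len(segments) - 1):
        edges[i].append((i + 1, None))  # continuous edge, no junction evidence needed
    for j in junctions:
        edges[end_at[j.donor]].append((start_at[j.acceptor], j.support))  # discontinuous edge
    return segments, edges


def assemble(genome: str, junctions: list[Junction],
             min_support: int = 2, max_jumps: int = 1) -> list[str]:
    """Enumerate 5'-to-3' paths, keeping well-supported ones that contain an ORF."""
    segments, edges = build_graph(len(genome), junctions)
    last = len(segments) - 1
    transcripts: list[str] = []

    def walk(node: int, path: list[int], jumps: int) -> None:
        if node == last:
            seq = "".join(genome[s:e] for s, e in (segments[i] for i in path))
            if has_orf(seq):
                transcripts.append(seq)
            return
        for nxt, support in edges[node]:
            if support is None:
                walk(nxt, path + [nxt], jumps)
            elif support >= min_support and jumps < max_jumps:
                walk(nxt, path + [nxt], jumps + 1)

    walk(0, [0], 0)
    return transcripts


# Toy genome: 5-nt "leader", two short ORFs, and two candidate junctions.
genome = "GGACCATGAAATAACCATGCCCTGATT"
junctions = [Junction(5, 16, 10), Junction(5, 14, 1)]
for transcript in assemble(genome, junctions):
    print(transcript)
# prints the full-length genomic sequence, then GGACCATGCCCTGATT
```

In this toy example the junction with support 10 yields one subgenomic transcript (leader joined to the second ORF), while the junction with support 1 is discarded by the threshold. A real assembler would derive junctions and coverage from aligned or k-mer-assembled reads rather than take them as input.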
Availability and implementation: VirDiG is freely available at https://github.com/Limh616/VirDiG.git.