High-Throughput Sequencing and Bioinformatic Analysis Reveal Presence of the Endogenous Pararetrovirus Tobacco vein clearing virus Genome in the Tomato (Solanum lycopersicum) Host Genome
Authors: Mahmood Othman Abass, Adnan A. Lahuf
Journal: Arab Journal of Plant Protection
Published: 2023-03-01
DOI: 10.22268/ajpp-41.1.077084
Citation count: 2
Abstract
Abass, M.O. and A.A. Lahuf. 2023. High-Throughput Sequencing and Bioinformatic Analysis Reveal Presence of the Endogenous Pararetrovirus Tobacco vein clearing virus Genome in the Tomato (Solanum lycopersicum) Host Genome. Arab Journal of Plant Protection, 41(1): 77-84. https://doi.org/10.22268/AJPP-41.1.077084

Endogenous pararetroviral sequences (EPRVs) are repetitive sequences that have been found mostly in the Kingdom Plantae, particularly in species of the family Solanaceae. In this study, the draft genome of an endogenous pararetrovirus, identified by next generation sequencing (NGS), was found integrated in the genome of Solanum lycopersicum. Homology alignment identified the virus as Tobacco vein clearing virus (TVCV), a member of the genus Solendovirus, family Caulimoviridae. Its double-stranded DNA genome is 7,760 nucleotides long and has four open reading frames (ORFs) encoding the conserved domains typical of Solendovirus: the putative coat protein (ORF1), the putative cell-to-cell movement protein (ORF2), the polyprotein (ORF3), which comprises the aspartic protease, reverse transcriptase and RNase H, and the putative trans-activator factor (ORF4). Sequence alignment showed that the Iraqi TVCV shares 81.60% sequence identity with the INSDC Tobacco vein clearing virus sequence (AF190123.1), which had previously been reported as integrated only in the genomes of some Nicotiana species. In the current study, however, the TVCV genome was found associated with the genome of S. lycopersicum. This finding was further verified by BLASTn analysis, which confirmed the presence of the TVCV genome in association with several cultivated and wild S. lycopersicum genomes worldwide. In conclusion, TVCV is the first EPRV of the genus Solendovirus discovered in S. lycopersicum, identified here from Iraq.
Keywords: Tobacco vein clearing virus, Solanum lycopersicum genome, next generation sequencing, NGS
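
The abstract reports a sequence identity figure (81.60%) between the Iraqi TVCV and AF190123.1. To illustrate how such a percentage can be derived from a pairwise alignment, here is a minimal Python sketch. It assumes the common BLAST-style definition (identical columns divided by total alignment columns, gaps included); the abstract does not state which definition or tool settings the authors used. The function name `percent_identity` and the toy sequences are hypothetical and are not TVCV data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences.

    Assumption: identity = identical columns / alignment length,
    where '-' marks a gap and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (made-up sequences): 8 identical columns out of 10.
toy_query = "ATGCC-TGAA"
toy_subject = "ATGCCGTGTA"
print(f"{percent_identity(toy_query, toy_subject):.2f}%")  # prints 80.00%
```

Other conventions divide by the shorter sequence length or exclude gapped columns, so an identity value is only comparable across studies when the denominator is stated.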