Enhancing the annotation of small ORF-altering variants using MORFEE: introducing MORFEEdb, a comprehensive catalog of SNVs affecting upstream ORFs in human 5'UTRs.
Caroline Meguerditchian, David Baux, Thomas E Ludwig, Emmanuelle Genin, David-Alexandre Trégouët, Omar Soukarieh
NAR Genomics and Bioinformatics, 7(1), lqaf017. Published 2025-03-19. DOI: 10.1093/nargab/lqaf017 (https://doi.org/10.1093/nargab/lqaf017). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11920869/pdf/
Abstract
Non-canonical small open reading frames (sORFs) are among the main regulators of gene expression. The most studied are upstream ORFs (upORFs), located in the 5'-untranslated region (UTR) of coding genes. Internal ORFs (intORFs) in the coding sequence and downstream ORFs (dORFs) in the 3'UTR have received less attention. Several bioinformatics tools predict single nucleotide variants (SNVs) that alter upORFs, mainly those creating AUGs or deleting stop codons, but no tool predicts variants altering non-canonical translation initiation sites or variants altering intORFs or dORFs. We propose an upgrade of our MORFEE bioinformatics tool that identifies, from a VCF file, SNVs that may alter any type of sORF in coding transcripts. Moreover, we generate an exhaustive catalog, named MORFEEdb, that reports all possible SNVs altering existing upORFs or creating new ones in human transcripts, and we provide an R script for visualizing the results. MORFEEdb has been integrated into the public platform Mobidetails. Finally, annotating ClinVar variants with MORFEE reveals that more than 45% of UTR-SNVs can alter upORFs or dORFs. In conclusion, MORFEE and MORFEEdb have the potential to improve the molecular diagnosis of rare human diseases and to facilitate the identification of functional variants from genome-wide association studies of complex traits.
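To make the two best-known classes of upORF-altering SNVs concrete (creation of an upstream AUG and deletion of an upORF stop codon), the following is a minimal Python sketch. It is not MORFEE's implementation. The function names `find_uorfs` and `classify_snv`, the effect labels, and the input convention (a 5'UTR given as a transcript-strand DNA string with 0-based positions) are all assumptions made for illustration, and only the canonical ATG start codon is considered.

```python
# Illustrative sketch only: hypothetical helpers, not part of MORFEE.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr):
    """Return (atg_index, stop_index) pairs for ATG-initiated ORFs
    that start and terminate inside the given 5'UTR sequence."""
    uorfs = []
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "ATG":
            continue
        # Walk codon by codon, in frame with the ATG, to the first stop.
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j))
                break
    return uorfs


def classify_snv(utr, pos, alt):
    """Classify a substitution at 0-based index `pos` of the 5'UTR.

    Returns a list of (effect, codon_start) tuples, where effect is
    'uAUG_gained' or 'uStop_lost' (labels assumed for this sketch).
    """
    utr, alt = utr.upper(), alt.upper()
    if not 0 <= pos < len(utr):
        raise IndexError("position outside the 5'UTR")
    if utr[pos] == alt:
        raise ValueError("alternate allele equals reference allele")
    mutated = utr[:pos] + alt + utr[pos + 1:]

    effects = []
    # A new ATG can only appear in one of the three 3-mers covering pos.
    for start in range(max(0, pos - 2), min(pos, len(utr) - 3) + 1):
        if mutated[start:start + 3] == "ATG":
            effects.append(("uAUG_gained", start))
    # A stop is lost when the SNV hits the stop codon of an existing uORF
    # and the mutated codon no longer encodes a stop.
    for _atg, stop in find_uorfs(utr):
        if stop <= pos < stop + 3 and mutated[stop:stop + 3] not in STOP_CODONS:
            effects.append(("uStop_lost", stop))
    return effects


if __name__ == "__main__":
    print(classify_snv("CCACGGTTCC", 3, "T"))     # ACG -> ATG
    print(classify_snv("CCATGGCATAGCC", 8, "C"))  # TAG -> CAG
```

A real annotator would additionally need to handle strand orientation, splicing across 5'UTR exons, non-canonical initiation codons, and upORFs that overlap the main coding sequence, none of which this sketch attempts.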