The widespread misuse of StringTie's gene identifier tags as de facto gene symbols does not allow consistent gene identification in published research
Liana Alves de Oliveira, Gabriela Canalli Kretzschmar, Sara Cristina Lobo-Alves, Giovanna Nazaré De Barros Prezia, Saloe Bispo, Roberto Rosati
Gene, volume 954, Article 149440. Published 2025-03-27. DOI: 10.1016/j.gene.2025.149440
https://www.sciencedirect.com/science/article/pii/S0378111925002288
Abstract
Current RNA sequencing techniques allow the characterization of novel genes and transcript isoforms, with StringTie being an effective and widely used tool for transcript assembly. In merge mode, StringTie assigns identifiers to putative genes, prefixed with “MSTRG” by default. These tags are sometimes used by authors as the designated name for genes or transcripts of high relevance in their work, even though they are unique and unambiguous only within the context of one analysis. In addition, when such identifiers are used, detailed genomic information is essential to identify the genes of interest clearly. In this report, we examined 161 studies that referred to a gene or transcript by its StringTie identifier in their title or abstract. Of these, only 41% provided sufficient information to clearly identify the gene(s) of interest and allow the reproducibility of the results and their comparison to further independent research. In light of this, we offer recommendations to address this issue and improve reporting standards.
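As an illustration of the "detailed genomic information" the abstract calls for, the following is a minimal sketch of how an analysis-specific `MSTRG` identifier could be resolved to a reportable locus from a merged GTF file. It is not the authors' method. The function name `locus_report`, the example records, and the placeholder name `GENE_X` are hypothetical, and the sketch assumes the usual nine-column, tab-separated GTF layout with `gene_id` and (where a reference annotation matched) `gene_name` attributes on `transcript` records.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')

def locus_report(gtf_lines, gene_id):
    """Summarise the genomic span of one gene_id from GTF transcript records.

    Returns None if the identifier is not found. Assumes all transcripts of
    the gene lie on one chromosome and strand (an assumption of this sketch).
    """
    chrom = strand = None
    start = end = None
    names = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript-level records are needed here
        attrs = dict(ATTR_RE.findall(fields[8]))
        if attrs.get("gene_id") != gene_id:
            continue
        chrom, strand = fields[0], fields[6]
        s, e = int(fields[3]), int(fields[4])
        start = s if start is None else min(start, s)
        end = e if end is None else max(end, e)
        if "gene_name" in attrs:
            names.add(attrs["gene_name"])  # reference symbol, if any
    if chrom is None:
        return None
    return {
        "gene_id": gene_id,
        "locus": f"{chrom}:{start}-{end}",
        "strand": strand,
        "reference_names": sorted(names),
    }

# Hypothetical records for illustration only (not real data).
EXAMPLE_GTF = [
    '# hypothetical merged annotation\n',
    'chr1\tStringTie\ttranscript\t1000\t5000\t1000\t+\t.\tgene_id "MSTRG.7"; transcript_id "MSTRG.7.1";\n',
    'chr1\tStringTie\texon\t1000\t1500\t1000\t+\t.\tgene_id "MSTRG.7"; transcript_id "MSTRG.7.1"; exon_number "1";\n',
    'chr1\tStringTie\ttranscript\t1200\t6200\t1000\t+\t.\tgene_id "MSTRG.7"; transcript_id "MSTRG.7.2";\n',
    'chr2\tStringTie\ttranscript\t300\t900\t1000\t-\t.\tgene_id "MSTRG.8"; transcript_id "MSTRG.8.1"; gene_name "GENE_X";\n',
]

if __name__ == "__main__":
    print(locus_report(EXAMPLE_GTF, "MSTRG.7"))
    print(locus_report(EXAMPLE_GTF, "MSTRG.8"))
```

Coordinates of this kind are only meaningful alongside the genome assembly and reference annotation version used in the analysis, so a report would need to state those as well.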
About the journal:
Gene publishes papers that focus on the regulation, expression, function and evolution of genes in all biological contexts, including all prokaryotic and eukaryotic organisms, as well as viruses.