Ming-Ren Yen, Ya-Ru Li, Chia-Yi Cheng, Ting-Ying Wu, Ming-Jung Liu
Title: TISCalling: leveraging machine learning to identify translational initiation sites in plants and viruses
Journal: Plant Molecular Biology, volume 115, issue 4, article 102. Published 2025-08-01.
DOI: https://doi.org/10.1007/s11103-025-01632-3
Abstract
Recognizing translational initiation sites (TISs) offers complementary insight for identifying genes that encode novel proteins or small peptides. Conventional computational methods primarily identify Ribo-seq-supported TISs and lack the capacity for systematic, global identification of TISs, especially non-AUG sites in plants. These methods are also often unsuitable for evaluating the importance of mRNA sequence features in TIS determination. In this study, we present TISCalling, a robust framework that combines machine learning (ML) models and statistical analysis to identify and rank novel TISs across eukaryotes. TISCalling generalizes and ranks important features common to multiple plant and mammalian species while identifying kingdom-specific features such as mRNA secondary structures and "G"-nucleotide contents. Furthermore, TISCalling achieved high predictive power for identifying novel viral TISs. Importantly, TISCalling provides prediction scores for putative TISs along plant transcripts, enabling prioritization of those of interest for further validation. We offer TISCalling as a command-line package (https://github.com/yenmr/TISCalling) capable of generating prediction models and identifying key sequence features. Additionally, we provide web tools (https://predict.southerngenomics.org/TISCalling/) for visualizing pre-computed potential TISs, making TISCalling accessible to users without programming experience. The TISCalling framework offers a sequence-aware and interpretable approach for decoding genome sequences and exploring functional proteins in plants and viruses.
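To make the idea of scanning a transcript for AUG and non-AUG candidate sites concrete, here is a minimal Python sketch. It is not TISCalling's implementation and does not use its command-line interface. The function names, the definition of near-cognate codons as single-nucleotide variants of AUG, the flank width, and the use of G fraction as the only feature are all illustrative assumptions. In the real framework, a trained ML model would assign the prediction score from a much richer feature set.

```python
from __future__ import annotations

from typing import Iterator, NamedTuple

BASES = "ACGU"


def near_cognate_codons(start: str = "AUG") -> set[str]:
    """Return codons that differ from `start` at exactly one position.

    Assumption: "non-AUG site" is approximated here by these nine codons.
    """
    return {
        start[:i] + base + start[i + 1:]
        for i in range(3)
        for base in BASES
        if base != start[i]
    }


# AUG plus its nine single-nucleotide variants (hypothetical candidate set).
CANDIDATE_CODONS = {"AUG"} | near_cognate_codons()


class Candidate(NamedTuple):
    position: int      # 0-based index of the codon's first nucleotide
    codon: str         # the candidate start codon, in RNA alphabet
    g_fraction: float  # fraction of G in the codon plus its flanks


def g_fraction(seq: str) -> float:
    """Fraction of G nucleotides in `seq` (0.0 for an empty string)."""
    return seq.count("G") / len(seq) if seq else 0.0


def scan_candidates(transcript: str, flank: int = 10) -> Iterator[Candidate]:
    """Yield every candidate start codon with a simple flanking feature.

    DNA input is accepted and converted to the RNA alphabet first.
    The window is truncated at the transcript ends.
    """
    rna = transcript.upper().replace("T", "U")
    for pos in range(len(rna) - 2):
        codon = rna[pos:pos + 3]
        if codon in CANDIDATE_CODONS:
            window = rna[max(0, pos - flank):pos + 3 + flank]
            yield Candidate(pos, codon, g_fraction(window))


if __name__ == "__main__":
    for hit in scan_candidates("GGCUGAAAUGC", flank=2):
        print(hit)
```

Each yielded record could serve as one row of a feature table for a classifier. Ranking candidates by a model's output score, rather than by this single feature, is what would correspond to the prioritization step described in the abstract.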
About the journal:
Plant Molecular Biology is an international journal dedicated to rapid publication of original research articles in all areas of plant biology. The Editorial Board welcomes full-length manuscripts that address important biological problems of broad interest, including research in comparative genomics, functional genomics, proteomics, bioinformatics, computational biology, biochemical and regulatory networks, and biotechnology. Because space in the journal is limited, however, preference is given to publication of results that provide significant new insights into biological problems and that advance the understanding of structure, function, mechanisms, or regulation. Authors must ensure that results are of high quality and that manuscripts are written for a broad plant science audience.