TISCalling: leveraging machine learning to identify translational initiation sites in plants and viruses.

Impact factor 3.8 | Zone 2 (Biology) | JCR Q2 (Biochemistry & Molecular Biology)
Ming-Ren Yen, Ya-Ru Li, Chia-Yi Cheng, Ting-Ying Wu, Ming-Jung Liu
Plant Molecular Biology 115(4), 102. Published 2025-08-01. DOI: 10.1007/s11103-025-01632-3 (https://doi.org/10.1007/s11103-025-01632-3). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12316744/pdf/

Abstract

The recognition of translational initiation sites (TISs) offers complementary insights into identifying genes encoding novel proteins or small peptides. Conventional computational methods primarily identify Ribo-seq-supported TISs and lack the capacity for systematic, global identification of TISs, especially non-AUG sites in plants. Additionally, these methods are often unsuitable for evaluating the importance of mRNA sequence features for TIS determination. In this study, we present TISCalling, a robust framework that combines machine learning (ML) models and statistical analysis to identify and rank novel TISs across eukaryotes. TISCalling generalizes and ranks important features common to multiple plant and mammalian species while identifying kingdom-specific features such as mRNA secondary structures and "G"-nucleotide contents. Furthermore, TISCalling achieved high predictive power for identifying novel viral TISs. Importantly, TISCalling provides prediction scores for putative TISs along plant transcripts, enabling prioritization of those of interest for further validation. We offer TISCalling as a command-line-based package [ https://github.com/yenmr/TISCalling ], capable of generating prediction models and identifying key sequence features. Additionally, we provide web tools [ https://predict.southerngenomics.org/TISCalling/ ] for visualizing pre-computed potential TISs, making the predictions accessible to users without programming experience. The TISCalling framework offers a sequence-aware and interpretable approach for decoding genome sequences and exploring functional proteins in plants and viruses.
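To make the idea of scoring putative AUG and non-AUG sites along a transcript more concrete, the following is a minimal illustrative sketch. It is not TISCalling's implementation and does not use its command-line interface. It assumes, purely for illustration, that non-AUG candidates are the nine near-cognate codons differing from AUG at one position, and that the G fraction in a fixed flanking window is one feature a downstream model might consume. The function names, the `flank` parameter, and the output fields are all hypothetical.

```python
from typing import Dict, List

BASES = "ACGU"


def near_cognate_codons(start: str = "AUG") -> List[str]:
    """Return AUG plus the nine codons that differ from it at exactly one position."""
    codons = [start]
    for i, ref in enumerate(start):
        for base in BASES:
            if base != ref:
                codons.append(start[:i] + base + start[i + 1:])
    return codons


def candidate_sites(transcript: str, flank: int = 10) -> List[Dict]:
    """List candidate initiation codons with a simple flanking G-content feature.

    Assumption (not from the paper): the feature window is `flank` nucleotides
    on each side of the codon, excluding the codon itself.
    """
    seq = transcript.upper().replace("T", "U")  # accept DNA or RNA input
    codons = set(near_cognate_codons())
    sites = []
    for pos in range(len(seq) - 2):
        codon = seq[pos:pos + 3]
        if codon not in codons:
            continue
        window = seq[max(0, pos - flank):pos] + seq[pos + 3:pos + 3 + flank]
        g_fraction = window.count("G") / len(window) if window else 0.0
        sites.append({
            "pos": pos,                # 0-based index of the codon's first base
            "codon": codon,
            "is_aug": codon == "AUG",
            "g_fraction": g_fraction,  # hypothetical feature for a downstream model
        })
    return sites


if __name__ == "__main__":
    for site in candidate_sites("GGGATGGGG", flank=3):
        print(site)
```

In a real workflow, a table of such per-site features would be passed to a trained classifier to obtain prediction scores; the actual feature set and models are defined by the TISCalling package itself.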

Source journal: Plant Molecular Biology (Biology: Biochemistry & Molecular Biology)
Self-citation rate: 2.00%
Articles published: 95
Review time: 1.4 months
Journal description: Plant Molecular Biology is an international journal dedicated to rapid publication of original research articles in all areas of plant biology. The Editorial Board welcomes full-length manuscripts that address important biological problems of broad interest, including research in comparative genomics, functional genomics, proteomics, bioinformatics, computational biology, biochemical and regulatory networks, and biotechnology. Because space in the journal is limited, however, preference is given to publication of results that provide significant new insights into biological problems and that advance the understanding of structure, function, mechanisms, or regulation. Authors must ensure that results are of high quality and that manuscripts are written for a broad plant science audience.