Current computational tools for protein lysine acylation site prediction.

IF 6.8 2区生物学 Q1 BIOCHEMICAL RESEARCH METHODS

Briefings in bioinformatics Pub Date : 2024-09-23 DOI:10.1093/bib/bbae469

Zhaohui Qin, Haoran Ren, Pei Zhao, Kaiyuan Wang, Huixia Liu, Chunbo Miao, Yanxiu Du, Junzhou Li, Liuji Wu, Zhen Chen

{"title":"Current computational tools for protein lysine acylation site prediction.","authors":"Zhaohui Qin, Haoran Ren, Pei Zhao, Kaiyuan Wang, Huixia Liu, Chunbo Miao, Yanxiu Du, Junzhou Li, Liuji Wu, Zhen Chen","doi":"10.1093/bib/bbae469","DOIUrl":null,"url":null,"abstract":"<p><p>As a main subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse functions of proteins. With recent advancements in proteomics technology, the identification of PTM is becoming a data-rich field. A large amount of experimentally verified data is urgently required to be translated into valuable biological insights. With computational approaches, PLA can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, a comprehensive summary of 166 in silico PLA prediction methods is presented, including a single type of PLA site and multiple types of PLA sites. This recapitulation covers important aspects that are critical for the development of a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to solve the small-sample learning issue. We also highlight the prediction methods developed for functionally relevant PLA sites and species/substrate/cell-type-specific PLA sites. In conclusion, this systematic review could potentially facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.</p>","PeriodicalId":9209,"journal":{"name":"Briefings in bioinformatics","volume":"25 6","pages":""},"PeriodicalIF":6.8000,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11421846/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Briefings in bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bib/bbae469","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}

引用次数: 0

Abstract

As a main subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse functions of proteins. With recent advancements in proteomics technology, the identification of PTM is becoming a data-rich field. A large amount of experimentally verified data is urgently required to be translated into valuable biological insights. With computational approaches, PLA can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, a comprehensive summary of 166 in silico PLA prediction methods is presented, including a single type of PLA site and multiple types of PLA sites. This recapitulation covers important aspects that are critical for the development of a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to solve the small-sample learning issue. We also highlight the prediction methods developed for functionally relevant PLA sites and species/substrate/cell-type-specific PLA sites. In conclusion, this systematic review could potentially facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.

查看原文本刊更多论文

目前用于预测蛋白质赖氨酸酰化位点的计算工具。

作为蛋白质翻译后修饰（PTM）的主要亚型，蛋白质赖氨酸酰化（PLAs）在调控蛋白质的多种功能方面发挥着至关重要的作用。随着蛋白质组学技术的不断进步，PTM 的鉴定正成为一个数据丰富的领域。大量经过实验验证的数据急需转化为有价值的生物学见解。利用计算方法，PLA 可以在整个蛋白质组中进行精确检测，即使是小规模数据集的生物体也不例外。本文全面总结了 166 种硅学聚乳酸预测方法，包括单一类型的聚乳酸位点和多种类型的聚乳酸位点。这一概述涵盖了对开发稳健预测方法至关重要的重要方面，包括数据收集和准备、样本选择、特征表示、分类算法设计、模型评估和方法可用性。值得注意的是，我们讨论了蛋白质语言模型和迁移学习的应用，以解决小样本学习问题。我们还重点介绍了针对功能相关的 PLA 位点和物种/底物/细胞类型特异性 PLA 位点开发的预测方法。总之，本系统综述有可能促进新型 PLA 预测方法的开发，并为各学科的研究人员提供有用的见解。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Briefings in bioinformatics 生物-生化研究方法

CiteScore

13.20

自引率

13.70%

发文量

549

审稿时长

6 months

期刊介绍： Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.