RLSuccSite: succinylation sites prediction based on reinforcement learning dynamic with balanced reward mechanism and three-peaks enhanced method for physicochemical property scores

IF 5.7 · Zone 2 (Chemistry) · Q1 CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics Pub Date : 2025-06-02 DOI:10.1186/s13321-025-01034-z

Lun Zhu, Qingchao Zhang, Sen Yang


Citations: 0

Abstract

Recent progress in computational biology has driven the development of machine learning models for predicting protein post-translational modification sites. However, data imbalance and limited sequence-context representation continue to hinder prediction accuracy, particularly for less frequent modifications such as succinylation. In this study, we propose RLSuccSite, a reinforcement learning-based framework designed to predict succinylation sites, which addresses class imbalance through a dynamic, balanced reward mechanism. To enhance sequence feature representation, we also introduce the Three-Peaks Enhanced Method for Physicochemical Property Scores (TPEM-PPS), a physicochemical property-driven feature extraction method that incorporates position-aware scoring to reflect amino acid contributions more effectively. The code and data for RLSuccSite are available on GitHub (https://github.com/Zhangqingchao-Ch/RLSuccSite.git).

Scientific contribution

This study applies reinforcement learning to protein succinylation site prediction, introducing a dynamic, balanced reward mechanism that addresses dataset imbalance. It also proposes a novel Three-Peaks Enhanced Method for Physicochemical Property Scores, which captures residue contributions with higher precision than traditional feature extraction techniques.
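The abstract does not give the reward formula, so the following is only a minimal sketch of one common way to make a classification reward imbalance-aware: scale the reward magnitude for each class inversely to its frequency. The function names `balanced_magnitudes` and `reward` are hypothetical, the "dynamic" part of the authors' mechanism is not modelled, and this should not be read as the RLSuccSite implementation.

```python
from collections import Counter


def balanced_magnitudes(labels):
    """Assumed scheme (not from the paper): reward magnitude per class is
    inversely proportional to class frequency, scaled so that the rarest
    class has magnitude 1.0."""
    counts = Counter(labels)
    rarest = min(counts.values())
    return {cls: rarest / n for cls, n in counts.items()}


def reward(action, label, magnitudes):
    """Positive reward for a correct prediction, negative for a wrong one,
    with the magnitude set by the true class of the sample."""
    m = magnitudes[label]
    return m if action == label else -m


# Toy data: 10 positive (succinylated) and 90 negative sites.
labels = [1] * 10 + [0] * 90
magnitudes = balanced_magnitudes(labels)

# Total reward of a trivial agent that always predicts the majority class.
always_negative = sum(reward(0, y, magnitudes) for y in labels)
```

Under this weighting, an agent that always predicts the majority class earns roughly zero total reward on the toy data, so it gains nothing by ignoring the rare positive sites.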


Source journal

Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 14.10

Self-citation rate: 7.00%

Review time: 3 months

About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases; molecular modelling; chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases; computer and molecular graphics; computer-aided molecular design; expert systems; QSAR; and data mining techniques.