An innovative peptide toxicity prediction model based on multi-scale convolutional neural network and residual connection.

IF 5.4
Shengli Zhang, Jingyi Ren, Yunyun Liang
Journal: Bioinformatics (Oxford, England). Publication type: Journal Article. Published: 2025-10-02. DOI: https://doi.org/10.1093/bioinformatics/btaf462
Citations: 0

Abstract

Motivation: Peptide toxicity is a critical concern in the development of peptide-based therapeutics, as toxic peptides can lead to severe side effects, including organ damage, immune reactions, and cytotoxicity. Predicting peptide toxicity accurately is essential to ensure the safety and efficacy of these drugs.

Results: In this study, we propose a novel model, ToxMSRC, to predict peptide toxicity using a combination of the continuous bag of words (CBOW) method from word2vec, the synthetic minority over-sampling technique (SMOTE), multi-scale convolutional neural networks (CNN), and bidirectional long short-term memory (BiLSTM). This approach addresses data imbalance by augmenting positive samples and improves feature extraction through multi-scale convolution. Furthermore, the model incorporates a residual connection that helps prevent overfitting and enhances generalization ability, improving classification performance. The model is evaluated on benchmark and independent test sets, achieving balanced accuracy (BACC) scores of 92.17% on independent test1 and 86.89% on independent test2, outperforming existing state-of-the-art models. Additionally, ToxMSRC provides valuable insights into the relationship between peptide toxicity and amino acid sequences, demonstrating its potential and practical value in peptide-based drug development.
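The results above are reported as BACC, which is commonly read as balanced accuracy: the mean of sensitivity and specificity. The following is a minimal sketch of that metric, assuming binary labels where 1 means toxic and 0 means non-toxic. The function name and the toy labels are hypothetical and are not taken from the ToxMSRC code or datasets.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on label 1) and specificity (recall on label 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / positives
    specificity = tn / negatives
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy set: 2 toxic peptides, 8 non-toxic peptides.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# The classifier misses one of the two toxic peptides.
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
bacc = balanced_accuracy(y_true, y_pred)

print(f"accuracy = {accuracy:.2f}")  # 9 of 10 correct
print(f"BACC     = {bacc:.2f}")      # (0.5 + 1.0) / 2
```

On this toy set plain accuracy is 0.90 while balanced accuracy is 0.75, because missing half of the minority class is averaged in at full weight. This is why a balanced metric suits an imbalanced toxic versus non-toxic problem.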

Availability and implementation: The complete datasets, source code, and pre-trained models are made available at https://github.com/Renjingyi123/ToxMSRC and https://doi.org/10.5281/zenodo.15668530.

Supplementary information: Supplementary data are available at Bioinformatics online.