NeuroPred-AIMP: Multimodal Deep Learning for Neuropeptide Prediction via Protein Language Modeling and Temporal Convolutional Networks.

IF 5.6 2区 化学 Q1 CHEMISTRY, MEDICINAL
Jinjin Li,Shuwen Xiong,Hua Shi,Feifei Cui,Zilong Zhang,Leyi Wei
{"title":"NeuroPred-AIMP: Multimodal Deep Learning for Neuropeptide Prediction via Protein Language Modeling and Temporal Convolutional Networks.","authors":"Jinjin Li,Shuwen Xiong,Hua Shi,Feifei Cui,Zilong Zhang,Leyi Wei","doi":"10.1021/acs.jcim.5c00444","DOIUrl":null,"url":null,"abstract":"Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, accurate identification is a huge challenge due to sequence heterogeneity, obscured functional motifs and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning methods, which are difficult to capture the deep patterns of sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes global semantic representation of the protein language model (ESM) and the multiscale structural features of the temporal convolutional network (TCN). The model introduced the adaptive features fusion mechanism of residual enhancement to dynamically recalibrate feature contributions, to achieve robust integration of evolutionary and local sequence information. The experimental results demonstrated that the proposed model showed excellent comprehensive performance on the independence test set, with an accuracy of 92.3% and the AUROC of 0.974. Simultaneously, the model showed good balance in the ability to identify positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, with a difference of less than 0.5%. The result fully confirms the effectiveness of the multimodal features strategy in the task of neuropeptide recognition.","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"51 1","pages":""},"PeriodicalIF":5.6000,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chemical Information and Modeling ","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1021/acs.jcim.5c00444","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MEDICINAL","Score":null,"Total":0}
引用次数: 0

Abstract

Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, accurate identification is a huge challenge due to sequence heterogeneity, obscured functional motifs and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning methods, which are difficult to capture the deep patterns of sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes global semantic representation of the protein language model (ESM) and the multiscale structural features of the temporal convolutional network (TCN). The model introduced the adaptive features fusion mechanism of residual enhancement to dynamically recalibrate feature contributions, to achieve robust integration of evolutionary and local sequence information. The experimental results demonstrated that the proposed model showed excellent comprehensive performance on the independence test set, with an accuracy of 92.3% and the AUROC of 0.974. Simultaneously, the model showed good balance in the ability to identify positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, with a difference of less than 0.5%. The result fully confirms the effectiveness of the multimodal features strategy in the task of neuropeptide recognition.
NeuroPred-AIMP:通过蛋白质语言建模和时序卷积网络进行神经肽预测的多模态深度学习。
神经肽是调节从代谢到认知功能等基本生理过程的关键信号分子。然而,由于序列异质性、模糊的功能基序和有限的实验验证数据,准确识别是一个巨大的挑战。神经肽的准确鉴定对于推进神经系统疾病治疗和基于多肽的药物设计至关重要。现有的神经肽识别方法依赖于人工特征与传统的机器学习方法相结合,难以捕捉到序列的深层模式。为了解决这些限制,我们提出了NeuroPred-AIMP(自适应集成多模态预测器),这是一个可解释的模型,它协同了蛋白质语言模型(ESM)的全局语义表示和时间卷积网络(TCN)的多尺度结构特征。该模型引入残差增强的自适应特征融合机制,对特征贡献进行动态再标定,实现进化序列信息与局部序列信息的鲁棒融合。实验结果表明,该模型在独立性测试集上具有良好的综合性能,准确率为92.3%,AUROC为0.974。同时,该模型在阳性和阴性样本的识别能力上表现出良好的平衡,灵敏度为92.6%,特异性为92.1%,差异小于0.5%。结果充分证实了多模态特征策略在神经肽识别任务中的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
9.80
自引率
10.70%
发文量
529
审稿时长
1.4 months
期刊介绍: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信