DeepCR: predicting cytokine receptor proteins through pretrained language models and deep learning networks.

Impact factor 2.4 · Zone 3 (Biology) · JCR Q3 (Biochemistry & Molecular Biology)
Van The Le, Juan Peter Timothy Yuune, Thi Thu Phuong Vu, Muhammad Shahid Malik, Yu-Yen Ou
Journal of Biomolecular Structure & Dynamics, pp. 1-18. Published 2025-05-31. Publication type: Journal Article.
DOI: 10.1080/07391102.2025.2512448 (https://doi.org/10.1080/07391102.2025.2512448)

Abstract

Cytokine receptors play a pivotal role in mediating the immune response and are critical in cytokine storms, which underlie the pathogenesis of conditions such as acute respiratory distress syndrome (ARDS) and autoimmune disorders. Identifying cytokine receptors is essential for understanding their biological functions, exploring therapeutic targets, and guiding clinical interventions. Traditional biochemical methods for identifying cytokine receptors are labor-intensive, costly, and time-consuming, which motivates the search for more efficient alternatives. Recent advances in computational biology have enabled the use of machine learning to classify cytokine receptor proteins. Most existing approaches have focused on homologous features and protein composition to classify cytokine families, but no dedicated studies have been conducted on cytokine receptor proteins. This gap presents an opportunity to develop a method specifically for distinguishing cytokine receptors from other membrane proteins. In this study, we present a novel classification framework that combines pre-trained language models (PLMs) with a multi-window convolutional neural network (mCNN) architecture for the fast and accurate identification of cytokine receptor proteins. PLMs, such as ProtTrans and ESM variants, capture biochemical context directly from raw protein sequences, while the mCNN efficiently extracts local and global sequence patterns using convolutional layers with varying window sizes. Our model achieved an AUC of 0.96 in training and AUCs of 0.97 and 0.93 on two independent tests, demonstrating its effectiveness in distinguishing cytokine receptors from non-cytokine receptor proteins. By eliminating the need for manual feature extraction, this approach offers a robust and scalable solution for protein classification, paving the way for its application in drug discovery and in understanding cytokine-mediated diseases.
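
The multi-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the window sizes (2, 4, 8), the number of filters, the toy embedding width, and the helper names `conv1d_max_pool`, `multi_window_features` and `random_filter_bank` are all assumptions chosen for illustration, and random vectors stand in for real PLM embeddings (which are much wider). The sketch only shows how filters of different widths slide over per-residue embeddings and how their pooled outputs are concatenated into one feature vector.

```python
import random


def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter along the sequence, apply ReLU, then global max pooling.

    embeddings: list of L per-residue vectors, each of length D
    kernel:     list of w weight vectors (window size w), each of length D
    """
    seq_len, window = len(embeddings), len(kernel)
    if seq_len < window:
        raise ValueError("sequence is shorter than the filter window")
    best = 0.0  # ReLU output is never negative, so 0.0 is a safe floor
    for start in range(seq_len - window + 1):
        activation = bias
        for offset in range(window):
            residue_vec = embeddings[start + offset]
            weights = kernel[offset]
            activation += sum(x * w for x, w in zip(residue_vec, weights))
        best = max(best, activation)
    return best


def multi_window_features(embeddings, filter_banks):
    """Concatenate pooled outputs of filter banks with different window sizes."""
    features = []
    for window in sorted(filter_banks):
        for kernel in filter_banks[window]:
            features.append(conv1d_max_pool(embeddings, kernel))
    return features


def random_filter_bank(window_sizes, n_filters, dim, seed=0):
    """Hypothetical untrained filters; a real model would learn these weights."""
    rng = random.Random(seed)
    return {
        w: [[[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(w)]
            for _ in range(n_filters)]
        for w in window_sizes
    }


# Toy stand-in for PLM output: 30 residues, 8 values per residue (assumed sizes).
rng = random.Random(1)
seq_len, dim = 30, 8
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]

banks = random_filter_bank(window_sizes=(2, 4, 8), n_filters=4, dim=dim)
features = multi_window_features(embeddings, banks)
print(len(features))  # 3 window sizes x 4 filters = 12 features
```

In a full classifier, a feature vector like this would feed a dense layer that outputs the probability that the protein is a cytokine receptor. Small windows respond to short local motifs, while larger windows cover longer stretches of the sequence.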

Source journal: Journal of Biomolecular Structure & Dynamics (Biology: Biochemistry & Molecular Biology)
CiteScore: 8.90
Self-citation rate: 9.10%
Articles published: 597
Review time: 2 months
About the journal: The Journal of Biomolecular Structure and Dynamics welcomes manuscripts on biological structure, dynamics, interactions and expression. The Journal is one of the leading publications in high-end computational science, atomic structural biology, bioinformatics, virtual drug design, genomics and biological networks.