Improved in Silico Identification of Protein-Protein Interactions Using Deep Learning Approach

IF 1.9; Zone 4 (Biology); JCR Q4 (Cell Biology)
Irfan Khan, Muhammad Arif, Ali Ghulam, Somayah Albaradei, Maha A. Thafar, Apilak Worachartcheewan
Journal: IET Systems Biology, vol. 19, no. 1. Published 2025-04-24. DOI: 10.1049/syb2.70008
https://onlinelibrary.wiley.com/doi/10.1049/syb2.70008

Abstract

Protein–protein interactions (PPIs) play significant roles in many biological activities, such as gene regulation, metabolic pathways and signal transduction. The deregulation of PPIs may cause deadly diseases, such as cancer, autoimmune disorders and pernicious anaemia. Detecting PPIs can help elucidate the molecular mechanisms underlying cellular processes and facilitate the discovery of new proteins for the development of novel drugs. Although high-throughput wet-lab technologies have matured to the point of identifying PPIs at large scale, the traditional experimental methods are costly, slow and resource intensive. To support experimental techniques, numerous computational approaches have emerged for identifying PPIs solely from protein sequences. However, the performance of available PPI tools is unsatisfactory, and there is room for further improvement. In this study, a novel deep learning-based model, Deep_PPI, was developed for predicting PPIs across multiple species. To extract the biological features, the authors used a 21D vector representing the 20 native amino acid residues and one special residue, and implemented the binary profile encoding technique with Keras to represent each residue in a protein. The binary profile uses the PaddVal strategy to equalise the lengths of the sequences in positive and negative PPIs. After extracting the features, the authors fed them into a one-dimensional convolutional neural network to build the final prediction model. The proposed Deep_PPI model processes the two proteins of each pair in two separate convolutional heads. Finally, the outputs of the two branches are concatenated and passed to a fully connected layer. The effectiveness of the proposed predictor was demonstrated by cross-validation and by testing on datasets from various species (Human, C. elegans, E. coli and H. sapiens). The proposed model surpassed both the machine-learning models and existing state-of-the-art PPI methods. The proposed Deep_PPI will serve as a valuable tool for the discovery of large-scale PPIs in particular and provide insights for drug development in general.
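
The encoding and two-head design described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' Keras implementation. The alphabet order, the use of "X" as the special residue, all-zero rows as padding, the kernel counts and widths, and the random weights are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random

# Assumption: 20 standard residues plus "X" as the one special residue.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def encode(seq, max_len):
    """Binary-profile encode a sequence as max_len rows of 21 values."""
    rows = []
    for aa in seq[:max_len].upper():           # truncate long sequences
        row = [0] * len(ALPHABET)
        row[INDEX.get(aa, INDEX["X"])] = 1      # unknown letters map to "X"
        rows.append(row)
    # Assumed padding scheme: all-zero rows appended up to max_len.
    rows += [[0] * len(ALPHABET) for _ in range(max_len - len(rows))]
    return rows


def conv_head(x, kernels):
    """One head: valid 1D convolution, ReLU, then global max pooling."""
    feats = []
    for k in kernels:                           # k has shape [width][21]
        width = len(k)
        best = 0.0                              # ReLU floor
        for start in range(len(x) - width + 1):
            s = sum(k[i][j] * x[start + i][j]
                    for i in range(width) for j in range(len(ALPHABET)))
            best = max(best, s)
        feats.append(best)
    return feats


def predict(seq_a, seq_b, kernels_a, kernels_b, dense_w, dense_b, max_len):
    """Run each protein through its own head, concatenate, apply one dense unit."""
    merged = (conv_head(encode(seq_a, max_len), kernels_a)
              + conv_head(encode(seq_b, max_len), kernels_b))
    z = sum(w * f for w, f in zip(dense_w, merged)) + dense_b
    return 1.0 / (1.0 + math.exp(-z))           # sigmoid interaction score


def random_kernels(n, width, rng):
    """Untrained placeholder kernels, for shape illustration only."""
    return [[[rng.uniform(-0.5, 0.5) for _ in ALPHABET]
             for _ in range(width)] for _ in range(n)]


rng = random.Random(0)
kernels_a = random_kernels(4, 3, rng)
kernels_b = random_kernels(4, 3, rng)
dense_w = [rng.uniform(-0.5, 0.5) for _ in range(8)]
score = predict("MKTAYIAK", "GSHMLE", kernels_a, kernels_b, dense_w, 0.0, max_len=10)
print(f"interaction score (untrained): {score:.3f}")
```

Because the weights are random, the printed score carries no biological meaning. The sketch only shows how two padded 21D binary profiles flow through separate convolutional branches before being merged for a single prediction.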

Source journal: IET Systems Biology
Subject area: Biology, Mathematical and Computational Biology
CiteScore: 4.20
Self-citation rate: 4.30%
Articles published: 17
Review time: >12 weeks
Journal description: IET Systems Biology covers intra- and inter-cellular dynamics, using systems- and signal-oriented approaches. Papers that analyse genomic data in order to identify variables and basic relationships between them are considered if the results provide a basis for mathematical modelling and simulation of cellular dynamics. Manuscripts on molecular and cell biological studies are encouraged if the aim is a systems approach to dynamic interactions within and between cells. The scope includes the following topics: genomics, transcriptomics, proteomics, metabolomics, cells, tissue and the physiome; molecular and cellular interaction, gene, cell and protein function; networks and pathways; metabolism and cell signalling; dynamics, regulation and control; systems, signals, and information; experimental data analysis; mathematical modelling, simulation and theoretical analysis; biological modelling, simulation, prediction and control; methodologies, databases, tools and algorithms for modelling and simulation; modelling, analysis and control of biological networks; synthetic biology and bioengineering based on systems biology.