A DEEP LEARNING MODEL IMPLEMENTATION OF TABNET FOR PREDICTING PEPTIDE-PROTEIN INTERACTION IN CANCER

JCR quartile: Q4 (Engineering)
Hanif Aditya Pradana, Ahmad Ardra Damarjati, Isman Kurniawan, W. Kusuma
Published: 2024-04-20
DOI: 10.21817/indjcse/2024/v15i1/241501032 (https://doi.org/10.21817/indjcse/2024/v15i1/241501032)
Citations: 0

Abstract

Cancer has become one of the deadliest diseases in the world and is mainly caused by the accumulation of somatic and inherited mutations. At the molecular level, this phenomenon can be traced back to proteins, the molecules responsible for various bioprocesses in the human body through their interactions with other molecules. Abnormalities in these interactions can lead to various undesirable outcomes, including diseases such as cancer. Peptides are candidate molecules for targeting protein interactions to treat cancer. However, identifying the peptides that correspond to target proteins in the laboratory is time-consuming and expensive, so computational methods are needed to aid identification. This study uses TabNet, a deep learning-based computational method. For comparison, we selected two ensemble learning techniques, Random Forest and Extreme Gradient Boosting, and two deep learning methods, Convolutional Neural Network and Stacked Autoencoder-Deep Neural Network. Predictions are performed on a multi-feature peptide-protein interaction dataset whose features include position-specific scoring matrices, intrinsic disorder, amino acid sequence, and physicochemical properties. Among our selected metrics, TabNet achieved a higher AUC (0.7) and fewer false negatives than the other models.
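The abstract reports AUC and false negatives but does not say how they were computed. As a minimal sketch, assuming binary interaction labels, predicted probabilities, and a 0.5 decision threshold (all assumptions, as are the function names and the toy data), the two metrics could be calculated like this:

```python
# Illustrative only: toy labels and scores, not data from the paper.

def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive example
    is scored above a random negative one (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def false_negatives(labels, scores, threshold=0.5):
    """Count true interactions (label 1) predicted as non-interacting."""
    return sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)


# Hypothetical predictions for eight peptide-protein pairs.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.2, 0.8, 0.5]

if __name__ == "__main__":
    print("AUC:", roc_auc(labels, scores))
    print("False negatives:", false_negatives(labels, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for illustration; a rank-based implementation from a standard library would be the usual choice on a full dataset. False negatives matter here because each one is a true peptide-protein interaction that the model would discard before any laboratory follow-up.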
Source journal: Indian Journal of Computer Science and Engineering (Engineering: Engineering (miscellaneous))
Self-citation rate: 0.00%
Articles published: 146