Deep semi-supervised learning for DTI prediction using large datasets and H2O-spark platform

Meriem Bahi, M. Batouche
Published in: 2018 International Conference on Intelligent Systems and Computer Vision (ISCV), 2018-04-01
DOI: 10.1109/ISACV.2018.8354081 (https://doi.org/10.1109/ISACV.2018.8354081)
Citations: 9

Abstract

Drug repositioning is the process of reusing existing drugs for new indications by identifying potential drug-target interactions (DTIs). However, predicting new associations between drugs and target proteins in silico is challenging, because known DTIs are scarce and no experimentally verified negative drug-target interaction samples exist. Furthermore, the volume of genomic sequence and chemical structure data is growing exponentially, so processing it consumes considerable time and effort. For these reasons, we propose a new computational method based on deep semi-supervised learning, called DSSL-DTIs, to accurately predict new DTIs in the post-genome era using large datasets and the Spark-H2O platform. First, we use stacked autoencoders to convert high-dimensional features into low-dimensional representations. Then, we apply a second unsupervised stacked autoencoder model to initialize the weights of a supervised deep neural network. Compared with other state-of-the-art methods applied to the same DrugBank reference dataset, our approach outperforms them, with an overall accuracy above 98%. DSSL-DTIs can further be used to predict new drug-target interactions at large scale. The highly ranked candidate DTIs obtained from DSSL-DTIs are also confirmed in the DrugBank database and in the literature, which demonstrates the effectiveness of our method.
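
The three stages described in the abstract (autoencoder-based feature compression, unsupervised pretraining, supervised fine-tuning) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation and does not use H2O or Spark: the toy 8-bit feature vectors, the single-layer autoencoders (standing in for stacked ones), the layer sizes, and all hyperparameters are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    # Small random weights, zero biases.
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def forward(W, b, x):
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def sgd_step(W, b, x, delta, lr):
    # delta[j] is the loss gradient w.r.t. the pre-activation of unit j.
    for j, d in enumerate(delta):
        for i, xi in enumerate(x):
            W[j][i] -= lr * d * xi
        b[j] -= lr * d

def train_autoencoder(data, n_hidden, rng, epochs=50, lr=0.5):
    """Train encoder/decoder on unlabeled data; return the encoder only."""
    n_in = len(data[0])
    We, be = init_layer(n_in, n_hidden, rng)
    Wd, bd = init_layer(n_hidden, n_in, rng)
    for _ in range(epochs):
        for x in data:
            h = forward(We, be, x)
            r = forward(Wd, bd, h)
            # Squared reconstruction error through sigmoid outputs.
            d_out = [(ri - xi) * ri * (1 - ri) for ri, xi in zip(r, x)]
            d_hid = [hj * (1 - hj) * sum(d_out[k] * Wd[k][j] for k in range(n_in))
                     for j, hj in enumerate(h)]
            sgd_step(Wd, bd, h, d_out, lr)
            sgd_step(We, be, x, d_hid, lr)
    return We, be

def train_classifier(inputs, labels, W_init, b_init, rng, epochs=50, lr=0.5):
    """Fine-tune a network whose hidden layer starts from pretrained weights."""
    W1, b1 = [row[:] for row in W_init], b_init[:]
    W2, b2 = init_layer(len(W1), 1, rng)
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            h = forward(W1, b1, x)
            p = forward(W2, b2, h)[0]
            d_out = [p - y]  # cross-entropy gradient with a sigmoid output
            d_hid = [hj * (1 - hj) * d_out[0] * W2[0][j] for j, hj in enumerate(h)]
            sgd_step(W2, b2, h, d_out, lr)
            sgd_step(W1, b1, x, d_hid, lr)
    return lambda x: forward(W2, b2, forward(W1, b1, x))[0]

rng = random.Random(0)

def sample(label):
    # Hypothetical drug-target pair features: one half of the bits is
    # mostly on, depending on the label; each bit flips with probability 0.1.
    on = range(0, 4) if label == 1 else range(4, 8)
    return [1.0 if (i in on) != (rng.random() < 0.1) else 0.0 for i in range(8)]

unlabeled = [sample(rng.randint(0, 1)) for _ in range(60)]  # many, no labels
labels = [0, 1] * 5
labeled = [sample(y) for y in labels]                        # few, labeled

# Stage 1: compress high-dimensional features to a low-dimensional code.
We1, be1 = train_autoencoder(unlabeled, 3, rng)
encode = lambda x: forward(We1, be1, x)

# Stage 2: a second autoencoder, trained on the codes, supplies initial weights.
W_init, b_init = train_autoencoder([encode(x) for x in unlabeled], 2, rng)

# Stage 3: supervised fine-tuning on the small labeled set.
predict = train_classifier([encode(x) for x in labeled], labels, W_init, b_init, rng)
scores = [predict(encode(x)) for x in labeled]  # interaction probabilities
```

The point of the sketch is the data flow: the unlabeled pool shapes both the feature encoder and the classifier's starting weights, and the labeled set is used only in the final stage. That is what makes the scheme semi-supervised when labeled interactions are scarce.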