Drug side effect prediction through linear neighborhoods and multiple data source integration

Wen Zhang, Yanlin Chen, Shikui Tu, Feng Liu, Qianlong Qu
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2016-12-01
DOI: 10.1109/BIBM.2016.7822555 (https://doi.org/10.1109/BIBM.2016.7822555)
Citations: 53

Abstract

Predicting drug side effects is a critical task in drug discovery and attracts great attention in both academia and industry. Although many machine learning methods have been proposed, great challenges have arisen with the boom of precision medicine. On the one hand, many methods rest on the assumption that similar drugs may share the same side effects, but measuring drug-drug similarity appropriately is challenging. On the other hand, multi-source data provide diverse information for the analysis of side effects and should be integrated for high-accuracy prediction. In this paper, we tackle the side effect prediction problem through linear neighborhoods and multi-source data integration. In the feature space, linear neighborhoods are constructed to extract the drug-drug similarity, termed “linear neighborhood similarity”. By transferring this similarity into the side effect space, known side effect information is propagated through the similarity-based graph. We thus propose the linear neighborhood similarity method (LNSM), which uses single-source data for side effect prediction. Further, we extend LNSM to multi-source data and propose two data integration methods, the similarity matrix integration method (LNSM-SMI) and the cost minimization integration method (LNSM-CMI), which integrate drug substructure, target, transporter, enzyme, pathway and indication data to improve prediction accuracy. The proposed methods are evaluated on benchmark datasets. LNSM produces satisfactory results on single-source data. The data integration methods (LNSM-SMI and LNSM-CMI) effectively integrate multi-source data and outperform other state-of-the-art side effect prediction methods in cross validation and independent tests. The proposed methods are promising for drug side effect prediction.
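The abstract gives no formulas, so the following is only a rough sketch of the two steps it describes for single-source data: building a linear neighborhood similarity and propagating known side effects over the resulting graph. Everything specific here is an assumption rather than the authors' implementation: the function names, the parameters `k`, `reg` and `alpha`, the choice of Euclidean nearest neighbors, the closed-form reconstruction weights with negative values clipped (a shortcut in place of a properly constrained optimization), and the closed-form propagation rule.

```python
import numpy as np


def linear_neighborhood_similarity(X, k=3, reg=1e-3):
    """Approximate each drug's feature vector as a weighted sum of its k
    nearest neighbors and return the weights as a similarity matrix W.

    Assumption: weights come from a regularized least-squares solve, then
    negatives are clipped and each row is renormalized to sum to 1.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, np.inf)                    # a drug is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        Z = X[nbrs] - X[i]                          # neighbors centered on drug i
        G = Z @ Z.T                                 # local Gram matrix, shape (k, k)
        G += (reg * np.trace(G) + 1e-8) * np.eye(k) # regularize for stability
        w = np.linalg.solve(G, np.ones(k))
        w = np.clip(w, 0.0, None)                   # keep similarities non-negative
        W[i, nbrs] = w / w.sum()
    return W


def propagate_side_effects(W, Y, alpha=0.8):
    """Spread known drug/side-effect labels Y over the similarity graph W.

    Assumption: uses the fixed point of F = alpha * W @ F + (1 - alpha) * Y,
    i.e. F = (1 - alpha) * (I - alpha * W)^-1 @ Y.
    """
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, Y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(6, 8)).astype(float)  # toy drug features
    Y = rng.integers(0, 2, size=(6, 4)).astype(float)  # toy known side effects
    W = linear_neighborhood_similarity(X, k=3)
    scores = propagate_side_effects(W, Y, alpha=0.8)
    print(scores.round(3))
```

Each row of `scores` holds per-side-effect scores for one drug, with higher values suggesting a more likely association. Because `W` is non-negative with rows summing to 1, the scores stay between 0 and 1 when `Y` is binary.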