Fast Polypharmacy Side Effect Prediction Using Tensor Factorisation.

Oliver Lloyd, Yi Liu, Tom R Gaunt
Bioinformatics (Oxford, England). Published 2024-11-25. DOI: 10.1093/bioinformatics/btae706 (https://doi.org/10.1093/bioinformatics/btae706)

Abstract

Motivation: Adverse reactions from drug combinations are increasingly common, making their accurate prediction a crucial challenge in modern medicine. Laboratory-based identification of these reactions is insufficient due to the combinatorial nature of the problem. While many computational approaches have been proposed, tensor factorisation models have shown mixed results, necessitating a thorough investigation of their capabilities when properly optimized.

Results: We demonstrate that tensor factorisation models can achieve state-of-the-art performance on polypharmacy side effect prediction, with our best model (SimplE) achieving median scores of 0.978 AUROC, 0.971 AUPRC, and 1.000 AP@50 across 963 side effects. Notably, this model reaches 98.3% of its maximum performance after just two epochs of training (approximately 4 minutes), making it substantially faster than existing approaches while maintaining comparable accuracy. We also find that incorporating monopharmacy data as self-looping edges in the graph performs marginally better than using it to initialize embeddings.
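
To make the model named above concrete, the following is a minimal sketch of the SimplE scoring function as it is usually defined in the knowledge-graph literature: each entity has a head and a tail embedding, each relation has a forward and an inverse embedding, and a triple's score is the average of two trilinear products. It is not the authors' implementation (which uses PyTorch). The class name, the embedding size, the random initialisation, and the toy drug and side-effect indices are all assumptions made for illustration. The last lines show one plausible reading of "self-looping edges", namely encoding a single-drug side effect as a triple whose head and tail are the same drug.

```python
import random


def trilinear(a, b, c):
    """Sum over dimensions of the element-wise product of three vectors."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


class SimplESketch:
    """Illustrative SimplE scorer (hypothetical class, not the paper's code)."""

    def __init__(self, n_entities, n_relations, dim, seed=0):
        rng = random.Random(seed)

        def table(n_rows):
            # Small random initial values; the real initialisation is an assumption.
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(n_rows)]

        self.head = table(n_entities)      # entity embedding when used as head
        self.tail = table(n_entities)      # entity embedding when used as tail
        self.rel = table(n_relations)      # relation embedding, forward direction
        self.rel_inv = table(n_relations)  # relation embedding, inverse direction

    def score(self, h, r, t):
        """Average of the forward and inverse trilinear products for (h, r, t)."""
        forward = trilinear(self.head[h], self.rel[r], self.tail[t])
        backward = trilinear(self.head[t], self.rel_inv[r], self.tail[h])
        return 0.5 * (forward + backward)


# Toy graph: 3 drugs (entities), 2 side effects (relations). Indices are made up.
model = SimplESketch(n_entities=3, n_relations=2, dim=4)

# Polypharmacy edge: drug 0 and drug 1 taken together are linked by side effect 1.
poly_triple = (0, 1, 1)

# Monopharmacy data as a self-loop: drug 2 alone is linked to itself by side effect 0.
mono_triple = (2, 0, 2)

poly_score = model.score(*poly_triple)
mono_score = model.score(*mono_triple)
```

In training, such scores would typically be passed through a sigmoid and fitted against observed and negatively sampled triples; that step is omitted here because the abstract does not describe it.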

Availability and implementation: All code used in the experiments is available in our GitHub repository (https://doi.org/10.5281/zenodo.10684402). The implementation was carried out using Python 3.8.12 with PyTorch 1.7.1, accelerated with CUDA 11.4 on NVIDIA GeForce RTX 2080 Ti GPUs.

Supplementary information: Supplementary data, including precision-recall curves and F1 curves for the best performing model, are available at Bioinformatics online.
