KGML-xDTD: a knowledge graph–based machine learning framework for drug treatment prediction and mechanism description

IF 11.8 | Zone 2 (Biology) | Q1 MULTIDISCIPLINARY SCIENCES

GigaScience | Pub Date: 2022-11-30 | DOI: 10.1101/2022.11.29.518441

Chunyu Ma, Zhihan Zhou, Han Liu, D. Koslicki


Citations: 2

Abstract

Background: Computational drug repurposing is a cost- and time-efficient approach that aims to identify new therapeutic targets or diseases (indications) for existing drugs/compounds. It is especially valuable for emerging and/or orphan diseases because it requires less investment and a shorter research cycle than traditional wet-lab drug discovery. However, the mechanisms of action (MOAs) linking repurposed drugs to their target diseases remain largely unknown, which is still a main obstacle to the wide adoption of computational drug repurposing methods in clinical settings.

Results: In this work, we propose KGML-xDTD: a Knowledge Graph-based Machine Learning framework for explainably predicting Drugs Treating Diseases. It is a two-module framework that not only predicts the treatment probabilities between drugs/compounds and diseases but also explains them biologically via knowledge graph (KG) path-based, testable MOAs. We leverage knowledge- and publication-based information to extract biologically meaningful “demonstration paths” as intermediate guidance in the graph-based reinforcement learning (GRL) path-finding process. Comprehensive experiments and case study analyses show that the proposed framework can achieve state-of-the-art performance both in predicting drug repurposing and in recapitulating human-curated drug MOA paths.

Conclusions: KGML-xDTD is the first model framework that can offer KG-path explanations for drug repurposing predictions by combining prediction outcomes with existing biological knowledge and publications. We believe it can effectively reduce “black-box” concerns and increase prediction confidence for drug repurposing through predicted path-based explanations, and further accelerate drug discovery for emerging diseases.
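To make the idea of a KG path-based explanation concrete, the following is a minimal sketch and not the authors' method: the abstract describes a reinforcement-learning path finder guided by demonstration paths, whereas this example simply enumerates short paths in a toy graph by depth-first search. The graph, the node names (`DrugA`, `GeneX`, `GeneZ`, `PathwayY`, `DiseaseB`), the predicate names, and the functions `find_paths` and `format_path` are all hypothetical placeholders, not real biomedical facts or part of KGML-xDTD.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy knowledge graph: node -> list of (predicate, neighbor).
# Every node and predicate name here is a placeholder, not a real fact.
Edge = Tuple[str, str, str]  # (subject, predicate, object)
ToyKG = Dict[str, List[Tuple[str, str]]]

TOY_KG: ToyKG = {
    "DrugA": [("interacts_with", "GeneX"), ("interacts_with", "GeneZ")],
    "GeneX": [("participates_in", "PathwayY")],
    "GeneZ": [("associated_with", "DiseaseB")],
    "PathwayY": [("associated_with", "DiseaseB")],
    "DiseaseB": [],
}


def find_paths(kg: ToyKG, source: str, target: str, max_hops: int = 3) -> List[List[Edge]]:
    """Enumerate simple paths from source to target with at most max_hops edges.

    This is plain exhaustive depth-first search, used only for illustration.
    """
    results: List[List[Edge]] = []

    def dfs(node: str, path: List[Edge], visited: Set[str]) -> None:
        if node == target and path:
            results.append(list(path))  # copy, since path is mutated below
            return
        if len(path) >= max_hops:
            return
        for predicate, neighbor in kg.get(node, []):
            if neighbor in visited:
                continue  # keep paths simple (no repeated nodes)
            path.append((node, predicate, neighbor))
            visited.add(neighbor)
            dfs(neighbor, path, visited)
            visited.remove(neighbor)
            path.pop()

    dfs(source, [], {source})
    return results


def format_path(path: List[Edge]) -> str:
    """Render a path as 'A -[p]-> B -[q]-> C'."""
    if not path:
        return ""
    parts = [path[0][0]]
    for _, predicate, neighbor in path:
        parts.append(f"-[{predicate}]-> {neighbor}")
    return " ".join(parts)


if __name__ == "__main__":
    for candidate in find_paths(TOY_KG, "DrugA", "DiseaseB"):
        print(format_path(candidate))
```

Each returned path is a candidate explanation linking a drug to a disease through intermediate entities. Exhaustive enumeration like this scales poorly on a large biomedical KG, which is one reason a guided search such as the GRL process described above is attractive.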




Source journal

GigaScience (MULTIDISCIPLINARY SCIENCES)

CiteScore: 15.50
Self-citation rate: 1.10%
Articles published: 119
Review time: 1 week

About the journal: GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.