Biomedical Information Retrieval incorporating Knowledge Graph for Explainable Precision Medicine
Zuoxi Yang
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Published: 2020-07-25
DOI: https://doi.org/10.1145/3397271.3401458
Citations: 15
Abstract
For many complex diseases, there is no "one size fits all" solution for patients with a particular diagnosis; in practice, treatment should depend on the patient's genetics, environment, lifestyle choices, and other factors. Precision medicine, which aims to provide personalized treatment for a particular patient, has drawn increasing attention. The large number of available treatment options, however, makes it overwhelming for clinicians to choose the best treatment for a particular patient. One effective way to alleviate this problem is a biomedical information retrieval (BMIR) system, which can automatically find relevant information and appropriate treatments among a mass of alternative treatments and cases. However, the biomedical literature and clinical trials contain a large number of synonymous, polysemous, and context-dependent terms, which causes a semantic gap between query and document in traditional BMIR systems. Recently, deep learning-based BMIR systems have been adopted to address this problem and have the potential to improve BMIR performance. These approaches encode the semantic information of queries and documents as low-dimensional feature vectors. Although most existing deep learning-based BMIR systems achieve strong accuracy, they are usually treated as black-box models that lack explainability. Clinicians find it difficult to understand the ranked results, which makes them doubt the effectiveness of these systems. Reasonable explanations help clinicians make better decisions through appropriate inference over treatment logic, and thus further enhance the transparency and fairness of BMIR systems and the trust placed in them.

Furthermore, knowledge graphs (KGs), which contain abundant real-world facts and entities, have drawn increasing attention. A knowledge graph is an effective way to bring both accuracy and explainability to deep learning models and to reduce the knowledge gap between experts and the public. However, it is usually employed in BMIR systems merely as a query expansion strategy. How to extend explainable BMIR systems with knowledge graphs remains an open question.

Given the above, to alleviate the trade-off between accuracy and explainability in precision medicine, we propose to study biomedical information retrieval incorporating a knowledge graph for explainable precision medicine. In this work, we propose a neural BMIR model that addresses the semantic gap problem and fully investigates the utility of KGs for explainable BMIR systems. The model soft-matches the query and document using semantic information instead of ranking documents by exact matches. On the one hand, it encodes the semantic features of documents with convolutional neural networks, which have shown a strong ability to model text in recent years, and it measures the relevance between query and document via soft matches rather than exact matches. On the other hand, explainability is added to the BMIR model by extending the use of the knowledge graph. A graph-based strategy will be designed to achieve this goal by building knowledge-aware paths with the help of attention scores. Specifically, graph attention networks (GAT) will be adopted to model the query's representation by summarizing high-order connectivity in the graph structure. With GAT-level attention, weight scores are assigned automatically to build knowledge-aware propagation connectivity, which can serve as evidence for explanations in BMIR systems. Finally, the proposed system will be evaluated on datasets from the TREC Precision Medicine track.
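
The two mechanisms named in the abstract, soft matching and attention-weighted evidence, can be illustrated with a minimal Python sketch. It is not the author's model: the embeddings are invented three-dimensional toy vectors rather than CNN outputs, the function names (`exact_match`, `soft_match`, `attention_weights`) and the neighbor labels are hypothetical, and only the softmax normalization step of GAT-style attention is shown, with hand-picked raw scores standing in for a learned scoring function.

```python
import math

# Toy embeddings, invented for illustration only (assumption: in a real
# system these vectors would come from a trained neural encoder).
EMBEDDINGS = {
    "tumor":    [0.9, 0.1, 0.0],
    "neoplasm": [0.8, 0.2, 0.1],
    "aspirin":  [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def exact_match(query_terms, doc_terms):
    """Count query terms that literally occur in the document."""
    return sum(1 for term in query_terms if term in doc_terms)

def soft_match(query_terms, doc_terms):
    """Average, over query terms, of the best cosine similarity
    to any document term."""
    best = [
        max(cosine(EMBEDDINGS[q], EMBEDDINGS[d]) for d in doc_terms)
        for q in query_terms
    ]
    return sum(best) / len(best)

def attention_weights(raw_scores):
    """Softmax-normalize raw neighbor scores so they sum to 1.
    The raw scores are placeholders for a learned attention function."""
    top = max(raw_scores.values())
    exps = {k: math.exp(s - top) for k, s in raw_scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# A synonym pair: no literal overlap, but high embedding similarity.
query, document = ["tumor"], ["neoplasm"]
print("exact:", exact_match(query, document))
print("soft:", round(soft_match(query, document), 3))

# Hypothetical neighbors of a query entity in a knowledge graph.
weights = attention_weights(
    {"neighbor_a": 2.0, "neighbor_b": 1.0, "neighbor_c": 0.5}
)
evidence = max(weights, key=weights.get)
print("strongest edge:", evidence)
```

In this toy setting the synonym pair scores zero under exact matching but close to one under soft matching, and the neighbor with the largest normalized weight is the edge one would surface as evidence for a ranking decision.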