Biomedical Information Retrieval incorporating Knowledge Graph for Explainable Precision Medicine

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Pub Date: 2020-07-25. DOI: 10.1145/3397271.3401458

Zuoxi Yang


Citations: 15

Abstract

For many complex diseases, there is no "one size fits all" solution for patients with a particular diagnosis; in practice, treatment should depend on a patient's genetic profile, environment, lifestyle choices, and other factors. Precision medicine, which provides personalized treatment for a particular patient, has drawn increasing attention. The large number of available treatment options, however, makes it overwhelming for clinicians to choose the best treatment for a particular patient. One effective way to alleviate this problem is a biomedical information retrieval (BMIR) system, which can automatically find relevant information and suitable treatments among a mass of alternative treatments and cases. However, the biomedical literature and clinical trials contain a large number of synonymous, polysemous, and context-dependent terms, which causes a semantic gap between query and document in traditional BMIR systems. Recently, deep learning-based BMIR systems have been adopted to address this problem and have the potential to improve BMIR performance. With these approaches, the semantic information of queries and documents is encoded as low-dimensional feature vectors. Although most existing deep learning-based BMIR systems achieve strong accuracy, they are usually treated as black-box models that lack explainability. Clinicians find it difficult to understand the ranked results, which makes them doubt the effectiveness of these systems. Reasonable explanations help clinicians make better decisions through appropriate inference over treatment logic, and thus further enhance the transparency, fairness, and trustworthiness of BMIR systems.

Knowledge graphs (KGs), which contain abundant real-world facts and entities, have also drawn increasing attention. They are an effective way to bring both accuracy and explainability to deep learning models and to reduce the knowledge gap between experts and the public. In BMIR systems, however, a knowledge graph is usually employed merely as a query expansion strategy. How to extend explainable BMIR systems to knowledge graphs remains an open question.

Given the above, to alleviate the trade-off between accuracy and explainability in precision medicine, we propose to study biomedical information retrieval incorporating knowledge graphs for explainable precision medicine. In this work, we propose a neural BMIR model that addresses the semantic gap problem and fully investigates the utility of KGs for explainable BMIR systems. The model soft-matches query and document using semantic information instead of ranking by exact matches. On the one hand, the model encodes the semantic features of documents with convolutional neural networks, which have shown a strong ability to model text in recent years, and the relevance between query and document is measured via soft matches rather than exact matches. On the other hand, explainability is brought to the BMIR model by extending the utility of the knowledge graph. A graph-based strategy will be designed to achieve this goal by building knowledge-aware paths with the help of attention scores. Specifically, graph attention networks (GATs) will be adopted to model the query's representation by summarizing high-order connectivity in the graph structure. With GAT-level attention, weight scores are assigned automatically to build knowledge-aware propagation connectivity, which can be regarded as evidence for explainable BMIR systems. Finally, the proposed system will be evaluated on the datasets from TREC Precision Medicine.
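The abstract does not give the model's equations, so the following is only a minimal sketch of how graph-attention weights over a query entity's knowledge-graph neighbors might be computed and read as evidence. It assumes the standard GAT scoring form (a LeakyReLU over a learned vector applied to the concatenated node features, followed by a softmax) with the feature projection omitted. The entity names, the two-dimensional embeddings, and the attention vector `attn_vec` are hypothetical values chosen for illustration and do not come from the paper.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU non-linearity, as used in standard GAT scoring."""
    return x if x > 0 else slope * x


def attention_weights(query_vec, neighbor_vecs, attn_vec):
    """Return softmax-normalized attention of a query node over its neighbors.

    Each raw score is LeakyReLU(attn_vec . [query_vec || neighbor_vec]).
    The learned projection matrix of a full GAT layer is omitted (assumption).
    """
    scores = {}
    for name, vec in neighbor_vecs.items():
        concat = query_vec + vec  # list concatenation: [h_i || h_j]
        scores[name] = leaky_relu(sum(a * x for a, x in zip(attn_vec, concat)))
    # Subtract the maximum score for a numerically stable softmax.
    max_score = max(scores.values())
    exp_scores = {n: math.exp(s - max_score) for n, s in scores.items()}
    total = sum(exp_scores.values())
    return {n: e / total for n, e in exp_scores.items()}


def aggregate(neighbor_vecs, weights):
    """Attention-weighted sum of neighbor vectors (one propagation step)."""
    dim = len(next(iter(neighbor_vecs.values())))
    return [
        sum(weights[n] * neighbor_vecs[n][d] for n in neighbor_vecs)
        for d in range(dim)
    ]


# Hypothetical toy data: a query entity and three knowledge-graph neighbors.
query = [1.0, 0.0]
neighbors = {
    "BRAF V600E": [0.9, 0.1],
    "vemurafenib": [0.8, 0.3],
    "hypertension": [-0.5, 0.9],
}
attn_vec = [0.5, 0.5, 1.0, -1.0]  # hypothetical "learned" attention parameters

weights = attention_weights(query, neighbors, attn_vec)
ranked = sorted(weights, key=weights.get, reverse=True)
query_repr = aggregate(neighbors, weights)

for name in ranked:
    print(f"{name}: {weights[name]:.3f}")
```

In this reading, the highest-weighted neighbors form the knowledge-aware path that could be shown to a clinician alongside a ranked result, while `query_repr` stands in for the query representation after one round of propagation. Stacking such rounds is one way to capture higher-order connectivity.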



