A question-answering system over Traditional Chinese Medicine

Xiangzhou Huang, Yin Zhang, Baogang Wei, Liang Yao
2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2015-11-09
DOI: 10.1109/BIBM.2015.7359945 (https://doi.org/10.1109/BIBM.2015.7359945)
Citations: 4

Abstract

Traditional Chinese Medicine (TCM) has been practiced for thousands of years and is a significant part of Chinese cultural heritage. Its theoretical framework is unique and rich in content, and it encodes complex relationships between diseases and medicines. Research on question answering (QA) over TCM is both significant and representative for Chinese NLP, because TCM resources are mostly written in Chinese. In this paper we present a QA system over TCM that transforms user-supplied questions into conjunctive queries (i.e., SQL) and retrieves answers from both the constructed dataset and an online encyclopedia. The contribution of this paper is threefold. First, we introduce a novel approach to word segmentation over Chinese questions: we apply a TF-IDF model to the dataset to generate a domain-specific dictionary with weight factors and tags, which are used to select the best segmentation result. Second, we present a novel method for constructing queries to retrieve answers: we compute the entity-attribute distance over a set of tagged words to construct incomplete ontology instances, which serve as the intermediary for generating queries. Third, we propose a method that integrates web data extraction with question answering, which allows us to extract answers from an online encyclopedia website (i.e., Wikipedia). The results of our evaluation with 50 benchmark queries demonstrate the value of our approach.
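The abstract gives no implementation details, so the following is only a minimal sketch of how the first two contributions could fit together, not the authors' method. It assumes a tiny hand-written dictionary in place of the TF-IDF-derived one, a dynamic-programming search that maximizes the summed word weights, token offset as the entity-attribute distance, and a hypothetical `facts(entity, attribute, value)` table. All names, weights, and the stored answer are illustrative placeholders.

```python
import math
import sqlite3

# Hypothetical domain dictionary: word -> (weight, tag). The paper derives
# weights with TF-IDF; the numbers below are invented for illustration.
DICTIONARY = {
    "人参": (3.2, "entity"),
    "功效": (2.5, "attribute"),
    "的": (0.1, "other"),
    "人": (0.5, "other"),
    "参": (0.4, "other"),
}
UNKNOWN_WEIGHT = -1.0  # assumed penalty for a single out-of-dictionary character
MAX_WORD_LEN = 4       # assumed upper bound on dictionary word length


def segment(question):
    """Return the segmentation with the highest total dictionary weight."""
    n = len(question)
    # best[i] holds (score, words) for the best segmentation of question[:i].
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = question[start:end]
            if word in DICTIONARY:
                weight = DICTIONARY[word][0]
            elif len(word) == 1:
                weight = UNKNOWN_WEIGHT
            else:
                continue  # multi-character strings must be dictionary words
            score = best[start][0] + weight
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[n][1]


def build_instance(words):
    """Pair the closest entity and attribute; the value slot stays unknown."""
    entities, attributes = [], []
    for position, word in enumerate(words):
        tag = DICTIONARY.get(word, (0.0, "other"))[1]
        if tag == "entity":
            entities.append((position, word))
        elif tag == "attribute":
            attributes.append((position, word))
    if not entities or not attributes:
        return None
    # Distance is taken to be the absolute token offset (an assumption).
    _, entity, attribute = min(
        (abs(ep - ap), e, a) for ep, e in entities for ap, a in attributes
    )
    return {"entity": entity, "attribute": attribute, "value": None}


def answer(instance, conn):
    """Turn the incomplete instance into a parameterized SQL query."""
    row = conn.execute(
        "SELECT value FROM facts WHERE entity = ? AND attribute = ?",
        (instance["entity"], instance["attribute"]),
    ).fetchone()
    return row[0] if row else None


# Usage with an in-memory database and a placeholder answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (entity TEXT, attribute TEXT, value TEXT)")
conn.execute(
    "INSERT INTO facts VALUES (?, ?, ?)",
    ("人参", "功效", "placeholder answer text"),
)
words = segment("人参的功效")
instance = build_instance(words)
print(words, instance, answer(instance, conn))
```

The instance is "incomplete" in the sense that its `value` slot is empty; the query fills that slot from the database. The web-extraction fallback described as the third contribution is not sketched here.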