Distant supervision for fine-grained biomedical relation extraction from Chinese EMRs

Qing Zhao, Zhilong Ma, Jianqiang Li
Published in: 2022 IEEE International Conference on Networking, Sensing and Control (ICNSC)
Publication date: 2022-12-15
DOI: 10.1109/ICNSC55942.2022.10004079 (https://doi.org/10.1109/ICNSC55942.2022.10004079)

Abstract

Automatically extracting relations between medical entity pairs is fundamental to biomedical research. Because annotated datasets are expensive to build, distant supervision offers an efficient way to reduce annotation cost by using a corpus coarsely labeled with a semantic knowledge base. However, the same two entities may express different relations in different sentences, and traditional distant supervision methods have difficulty distinguishing them. In this paper, we propose a new model for biomedical relation extraction from Chinese electronic medical records (EMRs). First, distant supervision is used for coarse-grained relation labeling. Then, fine-grained relations are initially annotated by measuring the distance between the contextual information of each relation instance and the semantic profile of each candidate fine-grained relation category. Finally, the high-confidence fine-grained relation instances are selected as the initial training set for a PCNN model, and bootstrap learning is introduced during training to improve fine-grained relation extraction. Experiments on a real-world dataset show that our method outperforms all baseline systems.
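
The first two steps of the pipeline described in the abstract (coarse labeling from a knowledge base, then fine-grained annotation by comparing a sentence's context with a per-relation semantic profile) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the toy knowledge base `KB`, the keyword-count `PROFILES`, the use of cosine similarity as the distance measure, the `threshold` value, and the English tokens standing in for segmented Chinese EMR text.

```python
import math
from collections import Counter

# Hypothetical knowledge base mapping an entity pair to a coarse relation.
KB = {("aspirin", "headache"): "drug-disease"}

# Hypothetical semantic profiles: weighted context words for each
# fine-grained relation that can refine a given coarse relation.
PROFILES = {
    "drug-disease": {
        "treats": Counter({"relieved": 2, "treated": 2, "improved": 1}),
        "causes": Counter({"after": 1, "developed": 2, "induced": 2}),
    }
}


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coarse_label(head, tail):
    """Distant supervision step: look the entity pair up in the KB."""
    return KB.get((head, tail))


def fine_label(tokens, head, tail, threshold=0.3):
    """Assign a fine-grained relation if the context is close enough
    to one of the candidate profiles; otherwise return None."""
    coarse = coarse_label(head, tail)
    if coarse is None:
        return None
    # Context = all tokens of the sentence except the two entities.
    context = Counter(t for t in tokens if t not in (head, tail))
    scores = {rel: cosine(context, prof)
              for rel, prof in PROFILES[coarse].items()}
    best = max(scores, key=scores.get)
    # Keep only high-confidence instances as seed training data.
    if scores[best] < threshold:
        return None
    return coarse, best, scores[best]


if __name__ == "__main__":
    sentences = [
        "headache relieved after aspirin".split(),
        "aspirin induced headache".split(),
        "aspirin and headache".split(),
    ]
    for tokens in sentences:
        print(tokens, "->", fine_label(tokens, "aspirin", "headache"))
```

In this sketch the same entity pair receives different fine-grained labels depending on the sentence, and instances whose best score falls below the threshold are left unlabeled. In the pipeline the abstract describes, the retained instances would seed the PCNN model, with bootstrap learning extending the training set during training; that step is not shown here.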