Exploiting collaborative learning for concept extraction in the medical field

Meng Tian, Jianqiang Li, Jijiang Yang, Bo Liu, Xi Meng, Ronghua Li, J. Bi
Proceedings of the 2nd International Conference on Communication and Information Processing, published 2016-11-26
DOI: 10.1145/3018009.3018054 (https://doi.org/10.1145/3018009.3018054)
Citations: 0

Abstract

With growing interest in the secondary use of medical data, concept extraction from electronic medical records has drawn increasing scholarly attention. Concept extraction methods mainly use fully labeled documents as training data to build a concept instance identifier, but manual data annotation is labor intensive. As a result, in many cases the available training data are only sparsely labeled, and the resulting classifier performs poorly. Existing concept extraction methods exploit either the diversity of datasets or the variety of learning models, but not both. This paper therefore proposes a novel approach that improves concept extraction from electronic medical records by combining the diversity of datasets with a variety of learning models. The large, sparsely labeled dataset is split into multiple subsets. Different learning models, such as HMM, MEMM, and CRF, are then trained on the different subsets in an iterative way. The technique leverages the fact that different learning algorithms have different inductive biases and that better predictions can be made by majority vote.
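The split-then-vote idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names HMM, MEMM, and CRF as the learners, whereas the sketch substitutes a trivial lookup tagger as a stand-in so that it runs with the standard library alone. The function names, the `None` marker for unlabeled tokens, the `PROBLEM`/`TREATMENT` labels, and the tie-breaking rule are all assumptions made for illustration, and only a single round is shown rather than the iterative procedure.

```python
from collections import Counter

def split_into_subsets(docs, k):
    """Round-robin split of a document list into k subsets (assumed scheme)."""
    return [docs[i::k] for i in range(k)]

def train_lookup_tagger(subset):
    """Stand-in learner, NOT an HMM/MEMM/CRF: it remembers the most
    frequent label seen for each token and predicts "O" otherwise."""
    counts = {}
    for tokens, labels in subset:
        for tok, lab in zip(tokens, labels):
            if lab is not None:  # None marks an unlabeled token (assumption)
                counts.setdefault(tok.lower(), Counter())[lab] += 1
    table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    return lambda tokens: [table.get(t.lower(), "O") for t in tokens]

def majority_vote(predictions, default="O"):
    """Per-token majority vote over several label sequences.
    Ties fall back to `default` (an assumed tie-breaking rule)."""
    voted = []
    for labels in zip(*predictions):
        (top, top_n), *rest = Counter(labels).most_common()
        if rest and rest[0][1] == top_n:
            voted.append(default)
        else:
            voted.append(top)
    return voted

# Toy, sparsely labeled corpus: (tokens, labels) pairs, invented for the sketch.
docs = [
    (["patient", "has", "fever"], ["O", "O", "PROBLEM"]),
    (["fever", "and", "cough"], ["PROBLEM", None, "PROBLEM"]),
    (["given", "aspirin", "for", "fever"], ["O", "TREATMENT", None, "PROBLEM"]),
]

subsets = split_into_subsets(docs, 3)
taggers = [train_lookup_tagger(s) for s in subsets]

sentence = ["fever", "treated", "with", "aspirin"]
voted = majority_vote([tag(sentence) for tag in taggers])
print(voted)
```

In an iterative variant, the voted labels for previously unlabeled tokens could be fed back into the subsets before retraining. How the paper handles that step is not stated in the abstract, so it is left out here.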