Machine Learning Based Information Extraction for Diabetic Nephropathy in Clinical Text Documents

X. Bao, Shuanglian Xie, Kai Zhang, Kai Song, Yunhaonan Yang
2019 6th International Conference on Systems and Informatics (ICSAI), published 2019-11-01
DOI: https://doi.org/10.1109/ICSAI48974.2019.9010211
Citations: 1

Abstract

Diabetic nephropathy is a common complication of diabetes mellitus, and early intervention is important. Building a predictive model for diabetic nephropathy requires extracting relevant information to serve as risk factors for prediction, so we constructed a gold-standard corpus. The study included 3422 admission summary notes from 2013 to 2017 at a tertiary hospital. We propose an information extraction method based on machine learning models to extract important information from unstructured medical record text. AdaBoost on Duration of Diabetes performed best (F1 = 0.97), while Family History of Heart Disease was the most challenging type to extract: the best model for it, an SVM, reached an F1 of 0.73. For the other six information types, the best model scored between 0.85 and 0.96, which suggests that practical application is feasible.
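The results above are reported as F1 scores measured against a gold-standard corpus. As a reminder of how that metric is derived, here is a minimal sketch in plain Python. The label lists are invented for illustration and are not data from the study, and the function name `f1_score` is a hypothetical helper rather than anything the authors describe.

```python
def f1_score(gold, predicted, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels (assumption): 1 = the note mentions the target information.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = f1_score(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# prints: precision=0.75 recall=0.75 F1=0.75
```

Because F1 is the harmonic mean of precision and recall, a model cannot score well by favoring one at the expense of the other, which matters for information types that appear in only a small share of notes.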