Traditional Chinese Medicine knowledge Service based on Semi-Supervised BERT-BiLSTM-CRF Model

Mingzhu Zhang, Zhongguo Yang, Chen Liu, Lei Fang
Published in: 2020 International Conference on Service Science (ICSS), 2020-08-01
DOI: 10.1109/ICSS50103.2020.00018 (https://doi.org/10.1109/ICSS50103.2020.00018)
Citations: 5

Abstract

Most Traditional Chinese Medicine (TCM) data and ancient records exist in book form. This unstructured medical information is the foundation for building TCM knowledge services. Existing methods for TCM named entity recognition are not accurate enough and require large amounts of manually labeled data. This paper proposes a semi-supervised embedded Semi-BERT-BiLSTM-CRF model. Working from the book “Diagnosis of Traditional Chinese Medicine in Traditional Chinese Medicine”, we select the physical features from the cleaned text according to the characteristics of Chinese medicine classics, and then train the BERT-BiLSTM-CRF model on a small amount of labeled data. The trained model predicts labels for the unlabeled data, producing pseudo-labeled data. The pseudo-labeled and labeled data are then combined into a single training set for further model training. Experiments show that the method reaches a TCM entity recognition accuracy of 81.24%, improving recognition accuracy while reducing manual labeling work. After further improvement and adaptation, the results can be applied to scenarios such as TCM auxiliary diagnosis and expert systems.
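
The pseudo-labeling procedure described in the abstract (train on a small labeled set, predict tags for unlabeled text, retrain on the union) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `MajorityTagger` is a toy stand-in for the BERT-BiLSTM-CRF tagger, and the names `pseudo_label_train`, the `SYM` tag, and the sample tokens are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Sentence = List[str]                  # one sentence as a list of tokens
Tagged = Tuple[Sentence, List[str]]   # tokens paired with BIO tags


class MajorityTagger:
    """Toy stand-in for a sequence tagger (NOT BERT-BiLSTM-CRF):
    each token receives the tag it was most often seen with in training."""

    def fit(self, data: List[Tagged]) -> "MajorityTagger":
        counts: Dict[str, Counter] = defaultdict(Counter)
        for tokens, tags in data:
            for tok, tag in zip(tokens, tags):
                counts[tok][tag] += 1
        self.best = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
        return self

    def predict(self, tokens: Sentence) -> List[str]:
        # Tokens never seen in training fall back to "O" (outside any entity).
        return [self.best.get(tok, "O") for tok in tokens]


def pseudo_label_train(
    labeled: List[Tagged],
    unlabeled: List[Sentence],
    make_model: Callable[[], MajorityTagger],
) -> Tuple[MajorityTagger, List[Tagged]]:
    # Step 1: train an initial model on the small labeled set.
    teacher = make_model().fit(labeled)
    # Step 2: predict tags for the unlabeled sentences to get pseudo-labels.
    pseudo = [(sent, teacher.predict(sent)) for sent in unlabeled]
    # Step 3: retrain on the labeled and pseudo-labeled data combined.
    student = make_model().fit(labeled + pseudo)
    return student, pseudo


# Hypothetical toy data; "SYM" is an illustrative symptom tag.
labeled = [
    (["tongue", "coating", "white"], ["B-SYM", "I-SYM", "I-SYM"]),
    (["pulse", "floating"], ["B-SYM", "I-SYM"]),
]
unlabeled = [["tongue", "coating", "yellow"], ["pulse", "deep"]]

model, pseudo = pseudo_label_train(labeled, unlabeled, MajorityTagger)
print(pseudo)
print(model.predict(["tongue", "coating", "yellow"]))
```

In a real system the model factory would build the neural tagger, and pseudo-labels are often filtered by prediction confidence before retraining; the abstract does not say whether the authors apply such a filter.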