Enhancing Diabetes Management With CRIBC: A Novel NER Model for Constructing A Comprehensive Chinese Medical Knowledge Graph
Authors: Yiqing Xu, Zalizah Awang Long, Djoko Budiyanto Setyohadi
Journal: Engineering Reports: Open Access, vol. 7, no. 10 (published 2025-10-03)
Impact factor: 2.0; JCR: Q3 (Computer Science, Interdisciplinary Applications)
DOI: 10.1002/eng2.70398
URL: https://onlinelibrary.wiley.com/doi/10.1002/eng2.70398
Citation count: 0
Abstract
This study proposes CRIBC, a novel Named Entity Recognition (NER) model tailored to Chinese medical texts, with a focus on diabetes-related data. By improving entity recognition accuracy, CRIBC supports the construction of a comprehensive knowledge graph for diabetes research and clinical decision-making. CRIBC combines Chinese-RoBERTa-WWM-EXT, IDCNN, BiLSTM, and CRF to optimize entity extraction. The model was trained on the DiaKG dataset and validated on the CMeEE dataset, and its performance was evaluated using precision, recall, and F1-score. A diabetes knowledge graph was then constructed from the extracted entities and relationships. CRIBC achieved an F1-score of 80.88% on the DiaKG dataset and 67.91% on the CMeEE dataset, outperforming the baseline models. The constructed knowledge graph contains 23,134 nodes and 42,520 edges and provides structured insights into diabetes management that can aid clinical decision-making and medical research. Overall, CRIBC significantly enhances NER accuracy in Chinese medical texts and enables efficient knowledge graph construction for diabetes management. Future research will focus on expanding the datasets and refining the model's capabilities for broader medical applications.
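
The abstract reports precision, recall, and F1-score but does not say how they were computed. As an illustration only, the sketch below assumes the common entity-level convention for NER: a predicted entity counts as correct only when its span and type both match a gold entity exactly, with scores micro-averaged over all sentences. The BIO tagging scheme, the function names `extract_spans` and `prf1`, and the entity types `Disease` and `Drug` are hypothetical choices for this example, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        # Close the open span unless the current tag continues it.
        if label is not None and tag != f"I-{label}":
            spans.add((start, i, label))
            start, label = None, None
        # A "B-" tag always opens a new span; stray "I-" tags are ignored.
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def prf1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged, exact-match precision, recall, and F1 over sentences."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = extract_spans(gold_tags)
        pred_spans = extract_spans(pred_tags)
        tp += len(gold_spans & pred_spans)   # exact span-and-type matches
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: the prediction finds the Disease entity but misses the Drug entity.
gold = [["B-Disease", "I-Disease", "O", "B-Drug"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
precision, recall, f1 = prf1(gold, pred)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

In this toy case one of two gold entities is recovered with no false positives, so precision is 1.0, recall is 0.5, and F1 is their harmonic mean, about 0.667. Exact-match scoring is strict: a prediction that gets the type right but the boundary wrong counts as both a false positive and a false negative.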