The application of natural language processing technology in hospital network information management systems: Potential for improving diagnostic accuracy and efficiency
Authors: Shiyong Wang, Hong Luo
Journal: SLAS Technology, volume 32, Article 100287
Published: 2025-04-18
DOI: 10.1016/j.slast.2025.100287
URL: https://www.sciencedirect.com/science/article/pii/S2472630325000457
Citations: 0
Abstract
Background
Processing scanned documents in electronic health records (EHR) is one of the persistent problems in hospital network information management systems (HNIMS). To overcome this difficulty, this study combines natural language processing (NLP), optical character recognition (OCR) and image preprocessing.
Objective
The goal is to investigate the potential for improving diagnostic efficiency and accuracy in healthcare settings by integrating NLP technologies into HNIMS. The reports studied belong to individuals who received diagnoses for a wide range of sleep problems. The collected data were converted into scanned PDF images, which were then preprocessed using grayscale conversion and OCR. Bag of Words (BoW) is used to extract features.
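The grayscale and bag-of-words steps can be illustrated with a minimal sketch. It is not the authors' implementation: the ITU-R BT.601 luma weights, the regex tokenizer, the function names and the sample strings are all assumptions made for illustration, and the OCR step is left out because the abstract does not name an engine.

```python
import re
from collections import Counter


def to_grayscale(rgb_pixels):
    """Convert rows of (R, G, B) pixels to 8-bit luma values.

    Assumption: BT.601 weights; the abstract does not state the formula used.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_pixels
    ]


def tokenize(text):
    """Lowercase the OCR text and keep alphanumeric runs (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token in the corpus to a column index."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector for one document; unknown tokens are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical OCR output from two sleep-study reports.
reports = ["AHI 12 events per hour", "severe apnea AHI 35"]
vocabulary = build_vocabulary(reports)
features = [bag_of_words(r, vocabulary) for r in reports]
```

Each report becomes a fixed-length count vector over the shared vocabulary, which is the form a bag-of-words classifier expects.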
Method
Reports are divided into a 70 % training set and a 30 % test set for NLP model evaluation. We propose a novel hidden Bayesian integrated dense Bi-LSTM (HB-DBi-LSTM) strategy, applying a hidden Bayesian technique on the development set to optimize the bag-of-words models. Because deep learning-based sequence models have high computing requirements, the data for these models are further split into training and validation sets at a 6:1 ratio. The dense Bi-LSTM model is trained for 100 epochs with the Adam optimizer.
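The splitting scheme can be sketched as follows. This is one reading of the abstract, not the authors' code: it assumes the 6:1 train/validation split is applied to the 70 % development portion, and the function name, the seed and the use of a random shuffle are illustrative choices.

```python
import random


def split_reports(report_ids, test_fraction=0.30, train_val_ratio=(6, 1), seed=0):
    """Split reports into train, validation and test lists.

    Assumption: the 6:1 ratio is applied to the non-test (development) part.
    """
    ids = list(report_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible

    n_test = round(len(ids) * test_fraction)
    test, development = ids[:n_test], ids[n_test:]

    train_parts, val_parts = train_val_ratio
    n_val = round(len(development) * val_parts / (train_parts + val_parts))
    validation, train = development[:n_val], development[n_val:]
    return train, validation, test


# With 100 hypothetical report IDs this yields 60 / 10 / 30 reports.
train, validation, test = split_reports(range(100))
```

Keeping the test set untouched until the final evaluation means the validation set alone drives choices such as when to stop training.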
Result
The models are evaluated at the segment level for the apnea-hypopnea index (AHI) and oxygen saturation (SaO2) using ROC curves and AUROC on the test sets. In the assessment of findings, the detection capability of the proposed model is evaluated with several metrics: F1-score (0.9637), accuracy (0.9321), recall (0.9421) and precision (0.9532). A document-level examination is also carried out to evaluate information extraction.
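For reference, the metrics named above have standard definitions, sketched here under stated assumptions. The confusion counts, labels and scores are invented for illustration and are not the study's data; AUROC is computed with the pairwise (Mann-Whitney) formulation, which may differ from the procedure the authors used.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Runs in O(P * N), fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
area = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Precision, recall and F1 depend on a decision threshold, while AUROC summarizes ranking quality across all thresholds, which is why both kinds of measure are usually reported together.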
Conclusion
The study emphasizes the critical need for robust natural language processing (NLP) systems within HNIMS to improve diagnostic speed and accuracy, especially when handling scanned documents in EHR.
About the journal:
SLAS Technology emphasizes scientific and technical advances that enable and improve life sciences research and development; drug-delivery; diagnostics; biomedical and molecular imaging; and personalized and precision medicine. This includes high-throughput and other laboratory automation technologies; micro/nanotechnologies; analytical, separation and quantitative techniques; synthetic chemistry and biology; informatics (data analysis, statistics, bio, genomic and chemoinformatics); and more.