Natural language processing for identifying major bleeding risk in hospitalised medical patients
Anne Bryde Alnor, Rasmus Bank Lynggaard, Martin Sundahl Laursen, Pernille Just Vinholt
Computers in Biology and Medicine, volume 190, Article 110093. Published 2025-03-30.
Journal impact factor: 7.0 (JCR Q1, Biology)
DOI: 10.1016/j.compbiomed.2025.110093
https://www.sciencedirect.com/science/article/pii/S0010482525004445
Citations: 0
Abstract
Background
Major bleeding is a severe complication in critically ill medical patients, resulting in significant morbidity, mortality, and healthcare costs. This study aims to assess the incidence and risk factors for major bleeding in hospitalised medical patients using a Natural Language Processing (NLP) model.
Methods
We conducted a retrospective, cross-sectional observational study using electronic health records of adult patients admitted through the Emergency Department at Odense University Hospital from January 2017 to December 2022. Major bleeding during admission was identified and validated using an NLP model, with events classified according to current guidelines. Risk factors, including demographics, comorbidities, and biochemical values at admission, were evaluated. Two risk assessment models (RAMs) were developed using Cox proportional hazards regression. Validation included bootstrapping, k-fold cross-validation, and cluster analyses.
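The abstract does not say how the resampling was implemented, so the following is only a minimal sketch of the two index-generation steps behind k-fold cross-validation and bootstrapping. The function names `k_fold_indices` and `bootstrap_indices`, the seed handling, and the fold assignment are all illustrative assumptions, not the authors' code.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split row indices 0..n-1 into k (train, test) pairs (hypothetical helper).

    Each index appears in exactly one test fold, so every patient is
    held out once and used for fitting in the other k - 1 rounds.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # shuffle once, reproducibly
    folds = [idx[f::k] for f in range(k)]  # near-equal fold sizes
    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


def bootstrap_indices(n: int, seed: int = 0) -> List[int]:
    """Draw n row indices with replacement (one bootstrap resample)."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(n)]


if __name__ == "__main__":
    for train, test in k_fold_indices(10, 5):
        print("train:", train, "test:", test)
    print("bootstrap:", bootstrap_indices(10))
```

In a validation loop, a model would be refitted on each `train` set (or bootstrap resample) and its discrimination measured on the held-out rows.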
Results
Of the 46,439 eligible patients, 1,246 (2.7%) experienced major bleeding. Risk factors for major bleeding included older age, male sex, alcohol consumption, higher systolic blood pressure, lower haemoglobin, and higher creatinine. RAM 1, which included biochemical data and comorbidities, demonstrated robust predictive performance (Harrell's C-statistic = 0.726). RAM 2, a simplified model without comorbidities, maintained similar predictive accuracy (C-statistic = 0.721), indicating its potential utility in clinical settings with limited resources for detailed patient histories. Results were consistent throughout validation.
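For readers unfamiliar with the reported metric, here is a minimal sketch of how Harrell's C-statistic can be computed for right-censored data. It is an illustration under simplifying assumptions (pairs with tied follow-up times are skipped, and the toy data are invented), not the study's implementation; the function name `harrell_c` is hypothetical.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[bool],
              risk: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data (sketch).

    A pair (i, j) is comparable when subject i had an observed event
    and subject j was still under follow-up after that time. The pair
    is concordant when the model gave i the higher risk score; tied
    risk scores count as half. Tied times are ignored for simplicity.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Invented toy data: follow-up days, bleeding observed?, model risk score.
    times = [2, 4, 6, 8]
    events = [True, True, False, True]
    risk = [0.9, 0.4, 0.5, 0.1]
    print(harrell_c(times, events, risk))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.726 and 0.721 should be read.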
Conclusion
This study highlights the incidence and risk factors of major bleeding in medical patients, emphasising the predictive value of routinely measured biochemical markers. Furthermore, it shows the applicability of NLP models in identifying bleeding episodes in EHR text.
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.