{"title":"“Terrible Stuff. We’ve been had“: hospital staff reactions to a new electronic health record and implications for employee well-being – A qualitative study","authors":"Eivind Sæthre , Solveig Osborg Ose , Steinar Krokstad , Sigmund Østgård Gismervik","doi":"10.1016/j.ijmedinf.2025.106039","DOIUrl":"10.1016/j.ijmedinf.2025.106039","url":null,"abstract":"<div><h3>Background</h3><div>Electronic Health Record (EHR) implementations significantly affect healthcare professionals’ work routines. Previous Epic implementations in Scandinavian hospitals have led to negative outcomes, highlighting the need for a thorough evaluation of employee experiences.</div></div><div><h3>Objective</h3><div>To qualitatively explore hospital employees’ experiences six months after Epic EHR implementation and assess implications for employee well-being and patient safety.</div></div><div><h3>Methods</h3><div>A qualitative study conducted at a Norwegian university hospital, six months post-implementation. Free-text responses from 950 employees (out of 2,115 survey respondents) were analyzed using reflexive thematic analysis within a phenomenological framework. Data were triangulated with focus group interviews and observational findings.</div></div><div><h3>Results</h3><div>Employees reported deep concerns and high emotional intensity post-implementation. Analysis revealed 13 themes, with <em>usability</em> being most prominent (n = 682 quotes). Participants described the system as cumbersome, inefficient, and counterintuitive. Other major themes included <em>work strain and own health</em> (n = 385), <em>administration of medicine and patient safety</em> (n = 201), and <em>politics and hospital management</em> (n = 184). 
Many employees experienced shifts in professional identity, with some expressing job abandonment intentions.</div></div><div><h3>Conclusions</h3><div>Poorly executed EHR implementations hinder professional performance, compromise patient care, and amplify emotional distress. Addressing workflow barriers and setting realistic expectations is critical for improving adoption. Future implementations must integrate employee perspectives and evidence-based strategies to ensure EHR systems enhance rather than obstruct healthcare delivery.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"204 ","pages":"Article 106039"},"PeriodicalIF":4.1,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144739189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Afeez Adekunle Soladoye , Nicholas Aderinto , Mayowa Racheal Popoola , Ibrahim A. Adeyanju , Ayokunle Osonuga , David B. Olawade
{"title":"Machine learning techniques for stroke prediction: A systematic review of algorithms, datasets, and regional gaps","authors":"Afeez Adekunle Soladoye , Nicholas Aderinto , Mayowa Racheal Popoola , Ibrahim A. Adeyanju , Ayokunle Osonuga , David B. Olawade","doi":"10.1016/j.ijmedinf.2025.106041","DOIUrl":"10.1016/j.ijmedinf.2025.106041","url":null,"abstract":"<div><h3>Background</h3><div>Stroke is a leading cause of mortality and disability worldwide, with approximately 15 million people suffering strokes annually. Machine learning (ML) techniques have emerged as powerful tools for stroke prediction, enabling early identification of risk factors through data-driven approaches. However, the clinical utility and performance characteristics of these approaches require systematic evaluation.</div></div><div><h3>Objectives</h3><div>To systematically review and analyze ML techniques used for stroke prediction, systematically synthesize performance metrics across different prediction targets and data sources, evaluate their clinical applicability, and identify research trends focusing on patient population characteristics and stroke prevalence patterns.</div></div><div><h3>Methods</h3><div>A systematic review was conducted following PRISMA guidelines. Five databases (Google Scholar, Lens, PubMed, ResearchGate, and Semantic Scholar) were searched for open-access publications on ML-based stroke prediction published between January 2013 and December 2024. Data were extracted on publication characteristics, datasets, ML methodologies, evaluation metrics, prediction targets (stroke occurrence vs. outcomes), data sources (EHR, imaging, biosignals), patient demographics, and stroke prevalence. Descriptive synthesis was performed due to substantial heterogeneity precluding quantitative meta-analysis.</div></div><div><h3>Results</h3><div>Fifty-eight studies were included, with peak publication output in 2021 (21 articles). 
Studies targeted three main prediction objectives: stroke occurrence prediction (n = 52, 62.7 %), stroke outcome prediction (n = 19, 22.9 %), and stroke type classification (n = 12, 14.4 %). Data sources included electronic health records (n = 48, 57.8 %), medical imaging (n = 21, 25.3 %), and biosignals (n = 14, 16.9 %). Systematic analysis revealed ensemble methods consistently achieved highest accuracies for stroke occurrence prediction (range: 90.4–97.8 %), while deep learning excelled in imaging-based applications. African populations, despite highest stroke mortality rates globally, were represented in fewer than 4 studies.</div></div><div><h3>Conclusion</h3><div>ML techniques show promising results for stroke prediction. However, significant gaps exist in representation of high-risk populations and real-world clinical validation. Future research should prioritize population-specific model development and clinical implementation frameworks.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106041"},"PeriodicalIF":3.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144604629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jennifer Sumner , Ravi Shankar , Anjali Bundele , Amanda Yap , Jaminah Mohamed Ali , Gim Gee Teng , Kee Fong Phang , Alexander Wenjun Yip , Yee Wei Lim
{"title":"Insights from high and low clinical users of telemedicine: a mixed-methods study of clinician workflows, sentiments, and user experiences","authors":"Jennifer Sumner , Ravi Shankar , Anjali Bundele , Amanda Yap , Jaminah Mohamed Ali , Gim Gee Teng , Kee Fong Phang , Alexander Wenjun Yip , Yee Wei Lim","doi":"10.1016/j.ijmedinf.2025.106044","DOIUrl":"10.1016/j.ijmedinf.2025.106044","url":null,"abstract":"<div><h3>Background</h3><div>Teleconsultation is a valuable tool in healthcare, but systematic evaluation of workflow processes (comparing teleconsultation to in-person visits) and the nuanced experiences of high and low clinical users of teleconsultation services is lacking. Understanding if and where differences exist is important to improve adoption and optimise service delivery. Our study objectives are: (1) To compare the process and workflow of teleconsultations and in-person consultations, identifying and quantifying if and where differences arise. (2) To examine clinicians’ experiences of teleconsultations, identifying barriers and enablers, and whether these differ between high and low providers.</div></div><div><h3>Methods</h3><div>We conducted a mixed-method study to explore workflow and clinician experiences with vCare (an outpatient chronic disease teleconsultation service run at Alexandra Hospital) versus in-person consultations. A time and motion study (n = 60 observations of individual consultations) quantified the task type and average duration for a teleconsultation and in-person visit. We also collected qualitative data (interviews and focus group discussions (n = 18)) from high and low-clinician users (physicians, nurses, and pharmacists) to understand clinical user experiences, barriers and enablers of vCare uptake. We defined low clinical users as clinicians who opted for vCare in less than 10 % of their monthly appointment time. 
Data were analysed using sentiment scoring and framework analysis, guided by the Consolidated Framework for Implementation Research (CFIR).</div></div><div><h3>Results</h3><div>Teleconsultation time was shorter: 10 min versus 15 min (mean difference 4 min 42 s, p < 0.001, confidence interval (CI) 2 min 25 s, 7 min), with less time spent on history taking (mean difference 58 s, p = 0.01, CI 12.9, 103.1) and patient discussion (mean difference 1 min 34 s, p = 0.03, CI 7.1, 180.3) compared to in-person consultations. High clinical vCare users expressed more positive sentiments towards teleconsultation than low clinical users (composite sentiment score 0.1860 versus 0.1225), particularly in the CFIR Implementation Process domain (high user: 0.276 and low user: −0.099). Regarding barriers and enablers, high and low clinical users aligned on several factors, including the impact of infrastructure quality, suitability of patients, costs, policy, and stakeholder buy-in. Unique uptake barriers from high clinical users included liability concerns, language barriers and health literacy. For low clinical users, the need for reminders, fatigue from teleconsultations, and challenges in providing emotional support were influencing factors.</div></div><div><h3>Conclusion</h3><div>Integrating quantitative and qualitative data revealed key process differences between tele- and in-person consultations, as well as variations in clinician experience among high and low clinical users of teleconsultation. 
Developi","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106044"},"PeriodicalIF":3.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144633067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hannah Ball , Emily Eisner , Jennifer Nicholas , Paul Wilson , Sandra Bucci
{"title":"Staff expectations for the implementation of digital remote monitoring in services for people with psychosis: A qualitative study using normalisation process theory","authors":"Hannah Ball , Emily Eisner , Jennifer Nicholas , Paul Wilson , Sandra Bucci","doi":"10.1016/j.ijmedinf.2025.106040","DOIUrl":"10.1016/j.ijmedinf.2025.106040","url":null,"abstract":"<div><h3>Background</h3><div>Digital remote monitoring (DRM) utilises devices such as smartphones and wearables to remotely collect health-related data, providing insights into the mental health of individuals with psychosis. This data can be shared with mental health services to aid clinical assessment. DRM has been found to effectively identify early signs of psychosis relapse, enabling clinicians to intervene earlier and improve outcomes for service users. However, there are challenges to its implementation in services. This study used Normalisation Process Theory (NPT) as a framework to examine mental health professionals’ expectations regarding the barriers and facilitators to implementing DRM in psychosis care.</div></div><div><h3>Methods</h3><div>Semi-structured interviews were conducted with 59 multi-disciplinary mental health professionals from nine UK National Health Service mental health Trusts/Health Boards. Interviews were inductively thematically analysed, then deductively analysed by mapping themes to the core constructs of NPT.</div></div><div><h3>Findings</h3><div>Findings were similar across all settings and applicable to three NPT constructs (coherence, cognitive participation and collective action) and their subcomponents. One inductive theme, ‘own experiences of technology’ was not captured by NPT. Participants understood DRM’s purpose for detecting early signs of relapse. 
However, several barriers to implementation were identified: uncertainty about professional roles, resource issues, concerns about inaccurate DRM data, complexity of the technology, security/privacy issues, and concerns about using DRM with certain clinical presentations. Suggested implementation strategies included staff training and ongoing technical support, developing guidance regarding professionals’ responsibilities, using an in-house ‘DRM expert’ to lead its integration within services, enhancing clinicians’ knowledge of the evidence base for DRM in psychosis care, and actively involving both clinicians and service users in DRM system development.</div><div>Interpretation</div><div>Findings identify key factors and actionable implementation strategies essential for successful early adoption of DRM in routine care. By addressing these considerations, implementation effectiveness can be optimised, ultimately improving outcomes for people with psychosis.</div><div>Funding</div><div>Wellcome Trust, National Institute for Health and Care Research.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106040"},"PeriodicalIF":3.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144604510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Chen , Yingying Hu , Wenwei Cai , Huibin Pan , Meihong Shen , Yujie Zhai , Shanhui Wu , Qunyi Zhou , Yi Guo
{"title":"Deep learning-based in-ambulance speech recognition and generation of prehospital emergency diagnostic summaries using LLMs","authors":"Chen Chen , Yingying Hu , Wenwei Cai , Huibin Pan , Meihong Shen , Yujie Zhai , Shanhui Wu , Qunyi Zhou , Yi Guo","doi":"10.1016/j.ijmedinf.2025.106029","DOIUrl":"10.1016/j.ijmedinf.2025.106029","url":null,"abstract":"<div><h3>Objective</h3><div>The timely and accurate submission of prehospital electronic medical records is crucial for the efficiency of medical rescue operations. However, personnel professional experience, training cycles, and environmental conditions often influence its completion rate. This study proposes integrating noise-robust speech recognition technology with large language models (LLMs) to generate emergency diagnosis summaries. This approach aims to help medical personnel quickly document key patient information, streamlining the emergency response process.</div></div><div><h3>Methods</h3><div>A joint training model combining speech enhancement and recognition was proposed, incorporating LLMs to generate emergency diagnosis summaries. The model was trained in two rounds using actual ambulance noise data, environmental noise data, and open-source speech datasets. The model optimized Connectionist Temporal Classification (CTC) and attention loss through deep feature extraction and the selective attention mechanism. The study also analyzed the impact of different prompt designs on the quality of LLM-generated summaries. Tukey HSD and Holm correction methods were employed for multiple comparisons of three subjective evaluation metrics under three prompts for three models, assessing the statistical significance of each factor’s influence on the generation results.</div></div><div><h3>Results</h3><div>The proposed speech recognition model reduced the character error rate in real-world ambulance noise recordings to 52.92%, outperforming several comparative speech recognition models. 
Under the Stylized Prompt condition, the Qwen2.5-7B-Instruct model demonstrated superior accuracy and relevance compared to other models in terms of subjectivity and relevance, reducing the average completion time for prehospital electronic medical records from 20 min to 14 min.</div></div><div><h3>Conclusion</h3><div>Using noise-robust speech recognition combined with LLMs to generate emergency diagnosis summaries improves efficiency and enhances medical record completion. This approach demonstrates broad application potential in emergencies and could be extended to quality evaluation, disease prediction, and risk assessment.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106029"},"PeriodicalIF":3.7,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144580749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Wang , Jing Lei , Zhiping Jin , Ying Jiang , Ningping Zhang , Minzhi Lv , Tianshu Liu
{"title":"Development and validation of a machine learning-based clinical prediction model for monitoring liver injury in patients with pan-cancer receiving immunotherapy","authors":"Yi Wang , Jing Lei , Zhiping Jin , Ying Jiang , Ningping Zhang , Minzhi Lv , Tianshu Liu","doi":"10.1016/j.ijmedinf.2025.106036","DOIUrl":"10.1016/j.ijmedinf.2025.106036","url":null,"abstract":"<div><h3>Background</h3><div>Immune checkpoint inhibitor (ICI)-related liver injury poses a considerable clinical challenge for cancer patients. This study aimed to develop and validate an interpretable predictive model employing machine learning (ML) algorithms to accurately identify patients at high risk of acute liver injury within one month of initiating ICI treatment.</div></div><div><h3>Methods</h3><div>This longitudinal cohort study included pan-cancer patients who received their first ICI treatment between March 2019 and September 2022 at Zhongshan Hospital. Six ML algorithms, namely neural networks (NN), gradient boosting classifier (GBC), eXtreme gradient boosting (XGBoost), logistic regression (LR), categorical boosting classifier (CatBoost) and random forest (RF), were utilized to construct predictive models for acute ICI-related liver injury. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and Brier score (BS). The SHapley Additive exPlanations (SHAP) method was applied to rank the feature importance and interpret the final model, providing insights into the contribution of each feature to liver injury prediction, thereby enhancing clinical interpretability. This study is registered with the Chinese Clinical Trial Registry (ChiCTR2300067470).</div></div><div><h3>Results</h3><div>A total of 863 patients were enrolled in the study, with 22.71% experiencing liver injury within one month of ICI initiation. 
Among the six preliminary models, the RF model exhibited the best performance and was selected for the development of the final model. The SHAP method was utilized to rank variables from the six pre-models, with 10 variables selected for the final model by identifying the intersection of the top 20 most important variables across these models. The final RF model exhibited robust performance, achieving an AUC of 0.81 (95% CI: 0.73–0.90) on the test set, and 0.79 (95% CI: 0.72–0.88) and 0.80 (95% CI: 0.72–0.89) in the 5-fold and 10-fold cross-validation, respectively. The Decision Curve Analysis (DCA) curve illustrated solid clinical benefit, and the calibration curve reflected good predictive consistency.</div></div><div><h3>Conclusion</h3><div>An interpretable RF model was developed to predict acute liver injury occurring within one month after ICI treatment. This clinical-friendly model enables early identification of high-risk patients, facilitating optimized clinical management and ultimately improving treatment outcomes.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106036"},"PeriodicalIF":3.7,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144580747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F.A.C.J. Heijsters , E.C. Cornelissen , M.C. de Bruijne , M. Bouman , M.G. Mullender , F. van Nassau
{"title":"Process evaluation of the implementation of a personalized digital care pathway tool using the RE-AIM framework","authors":"F.A.C.J. Heijsters , E.C. Cornelissen , M.C. de Bruijne , M. Bouman , M.G. Mullender , F. van Nassau","doi":"10.1016/j.ijmedinf.2025.106032","DOIUrl":"10.1016/j.ijmedinf.2025.106032","url":null,"abstract":"<div><h3>Objective</h3><div>This study evaluated the implementation of a Personalized Digital Care Pathway (PDCP) tool in the context of three different patient groups (i.e. scar clinic, cleft care and gender-affirming care). We assessed the reach of patient users, information provision perceived by patients (effectiveness), adoption by healthcare professionals, implementation in practice including perceived satisfaction, and sustainable implementation of the PDCP-tool.</div></div><div><h3>Materials & Methods</h3><div>A process evaluation was conducted according to the Reach, Effectiveness, Adoption, Implementation, and Maintenance (RE-AIM) framework using mixed methods. Data collection included patient questionnaires administered before (n = 139) and after (n = 68) implementation, tool user statistics, and in-depth interviews (n = 16) and focus groups (n = 4) with patients and healthcare professionals, supplemented by researcher field notes.</div></div><div><h3>Results</h3><div>Most patients using the tool were digitally literate and had previously used other hospital applications. Patients felt well-informed after using the PDCP-tool, but some expressed little added value. For adoption by healthcare professionals, their involvement during development and perceived added value of the tool were essential. For implementation, it was important to have a user-friendly tool, that is integrated with existing systems and meets the information needs of patients. 
Ongoing financial and technical support by the healthcare organization is needed for sustainable implementation.</div></div><div><h3>Conclusion</h3><div>The PDCP-tool was found to be of value in providing appropriate information, particularly to digital and language proficient, information-hungry patient groups. This process evaluation showed many promising elements for implementation of a digital tool in a large healthcare organization. However, achieving and sustaining the value of the tool required considerable efforts during development and implementation, as it necessitated continuous commitment to keep it in the focus of patients, healthcare professionals and the organization. Future studies should explore the effectiveness of integrating such a tool into widely used hospital information systems, ensuring it becomes an integral part of the healthcare workflow, rather than a stand-alone solution.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"204 ","pages":"Article 106032"},"PeriodicalIF":3.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144685657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuan Wu , Xuecheng Yao , Jianing Shi , Mengling Tang , Qingli Zhou , Kun Chen
{"title":"Development and validation of a machine learning model for early screening of high-risk mild cognitive impairment from the multi-cohort data","authors":"Xuan Wu , Xuecheng Yao , Jianing Shi , Mengling Tang , Qingli Zhou , Kun Chen","doi":"10.1016/j.ijmedinf.2025.106030","DOIUrl":"10.1016/j.ijmedinf.2025.106030","url":null,"abstract":"<div><h3>Background</h3><div>Early screening of mild cognitive impairment (MCI) in older populations is crucial for timely intervention. MCI often precedes dementia, but current diagnostic tools are time-consuming and not widely accessible. Utilizing basic physical examination data may enable earlier, more practical screening.</div></div><div><h3>Methods</h3><div>Data from the China Health and Retirement Longitudinal Study (CHARLS) 2015 were used to develop the model. Two external datasets from CHARLS 2011 and Yiwu 2021 cohorts were used for validation. A total of 34 variables were considered, including demographics, health conditions, lifestyle, and physical and blood examination data. The Mini-Mental State Examination (MMSE) was used for MCI diagnosis. Seven key variables (education, grip strength, height, weight, creatinine, mean corpuscular volume, and platelet count) were selected through majority voting. Five machine learning models were evaluated, and a Random Forest (RF) model was chosen based on its superior performance.</div></div><div><h3>Results</h3><div>The model demonstrated high diagnostic performance with a sensitivity of 0.906, specificity of 0.850, and accuracy of 85.5%. The area under the receiver operating characteristic curve (AUROC) was 0.93, and the area under the precision-recall curve (AUPRC) was 0.93. In the external validation, AUROCs of 0.83 and 0.87 were achieved. 
The model was enhanced with an explainable method and deployed via a Streamlit-based web application.</div></div><div><h3>Conclusions</h3><div>This study successfully developed machine learning-based models for early MCI screening in older populations via basic physical examination data and MCI risk prediction through a web calculator (<span><span>https://mciscreening.streamlit.app/</span></span>), both demonstrating favorable performance, generalizability, and effective clinical implementation.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106030"},"PeriodicalIF":3.7,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144580748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jing Li , Hong Yang , Yu Zhang , Jingjing Yan , Jing Tian , Yanbo Zhang
{"title":"Developing a multi-label learning model to predict major adverse cardiovascular events in patients with unstable angina pectoris: A prospective cohort study","authors":"Jing Li , Hong Yang , Yu Zhang , Jingjing Yan , Jing Tian , Yanbo Zhang","doi":"10.1016/j.ijmedinf.2025.106037","DOIUrl":"10.1016/j.ijmedinf.2025.106037","url":null,"abstract":"<div><h3>Background</h3><div>Major adverse cardiovascular events (MACE) represent critical endpoints in cardiovascular research. The occurrence of MACE in patients with unstable angina pectoris (UAP) exhibits multidimensional complexity. We employed multi-label learning (MLL) models to concurrently predict five distinct types of MACE.</div></div><div><h3>Methods</h3><div>This prospective observational cohort study analysed 978 UAP patients from the Second Affiliated Hospital of Shanxi Medical University (Taiyuan, China) between July 1, 2017, and June 30, 2019. Three-year follow-up endpoints encompassed all-cause death, heart failure, stroke, myocardial infarction, and revascularization. We utilized ReliefF for Multi-label Feature Selection (RFML), Mutual Information-based Feature Selection (MIFS), and Scalable Criteria for Large label Set (SCLS) to identify significant prognostic variables. Nineteen MLL models were implemented, including Binary Relevance (BR), Classifier Chains (CC), Label Powerset (LP), Random k-Labelsets (RAkEL), Multi-label k-Nearest Neighbor, Twin Support Vector Machine to Multi-label Learning, and Wrapping multi-label learning with label-specific features generation. BR, CC, LP, and RAkEL models were constructed using four base classifiers: Decision Tree, Random Forest, Extreme Gradient Boosting, and Light Gradient Boosting Machine. Performance evaluation incorporated 12 different metrics.</div></div><div><h3>Results</h3><div>The RFML, MIFS, and SCLS respectively screened 18, 12, and 14 important features, and the MLL prediction performance based on RFML selected features was the best. 
Among the MLL models, RAkEL with Random Forest as the base classifier demonstrated superior predictive performance, achieving an Accuracy of 0.575 ± 0.022, Precision of 0.646 ± 0.029, Hamming loss of 0.159 ± 0.008, One error of 0.425 ± 0.022, Macro_F1 of 0.719 ± 0.028, Micro_F1 of 0.740 ± 0.011, Macro_AUC of 0.786 ± 0.031, Micro_AUC of 0.806 ± 0.030 and Multi-Brier Score of 0.115 ± 0.035.</div></div><div><h3>Conclusions</h3><div>The RAkEL model with Random Forest as the base classifier significantly enhanced predictive accuracy for MACE in UAP patients. This approach provides a more comprehensive risk assessment, enabling clinicians to develop personalized treatment strategies and improve patient outcomes.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106037"},"PeriodicalIF":3.7,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144571213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From text to data: Open-source large language models in extracting cancer related medical attributes from German pathology reports","authors":"Stefan Bartels, Jasmin Carus","doi":"10.1016/j.ijmedinf.2025.106022","DOIUrl":"10.1016/j.ijmedinf.2025.106022","url":null,"abstract":"<div><div>Structured oncological documentation is vital for data-driven cancer care, yet extracting clinical features from unstructured pathology reports remains challenging—especially in German healthcare, where strict data protection rules require local model deployment. This study evaluates open-source large language models (LLMs) for extracting oncological attributes from German pathology reports in a secure, on-premise setting. We created a gold-standard dataset of 522 annotated reports and developed a retrieval-augmented generation (RAG) pipeline using an additional 15,000 pathology reports. Five instruction-tuned LLMs (Llama 3.3 70B, Mistral Small 24B, and three SauerkrautLM variants) were evaluated using three prompting strategies: zero-shot, few-shot, and RAG-enhanced few-shot prompting. All models produced structured JSON outputs and were assessed using entity-level precision, recall, accuracy, and macro-averaged F1-score. Results show that Llama 3.3 70B achieved the highest overall performance (F1 > 0.90). However, when combined with the RAG pipeline, Mistral Small 24B achieved nearly equivalent performance, matching Llama 70B on most entity types while requiring significantly fewer computational resources. Prompting strategy significantly impacted performance: few-shot prompting improved baseline accuracy, and RAG further enhanced performance, particularly for models with fewer than 24B parameters. Challenges remained in extracting less frequent but clinically critical attributes like metastasis and staging, underscoring the importance of retrieval mechanisms and balanced training data. 
This study demonstrates that open-source LLMs, when paired with effective prompting and retrieval strategies, can enable high-quality, privacy-compliant extraction of oncological information from unstructured text. The finding that smaller models can match larger ones through retrieval augmentation highlights a path toward scalable, resource-efficient deployment in German clinical settings.</div></div>","PeriodicalId":54950,"journal":{"name":"International Journal of Medical Informatics","volume":"203 ","pages":"Article 106022"},"PeriodicalIF":3.7,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}