Applied Clinical Informatics: Latest Articles

Comparing Large Language Models' Performances on Otolaryngology Knowledge Assessment Questions
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-03-01 (Epub: 2026-04-06) | DOI: 10.1055/a-2835-4634 | Vol. 17(2): 194-203
Ryan Cook, Abner Kahan, Thomas Scharfenberger, Jason Tasoulas, Noah Hawks-Ladds, Robert Chouake, Sunit P Jariwala, Shitij Arora

This study evaluates the performance of multiple large language models (LLMs) on specialized otolaryngology knowledge, comparing OpenAI's GPT-4 Turbo with 10 commercially available models to assess their potential utility in otolaryngology medical education.

A total of 1,075 questions from OTO QUEST, the official self-assessment resource of the American Academy of Otolaryngology-Head and Neck Surgery, were administered to GPT-4 Turbo using a zero-shot approach. Accuracy was analyzed using logistic regression, adjusting for question difficulty, year, and subspecialty. Performance was then compared with 10 other commercial models (including Claude-3.5-Sonnet, Gemini-1.5-Pro, and GPT-4o) on the same 1,075-question dataset, evaluated with Cochran's Q test (p < 0.001) and McNemar's pairwise comparisons.

GPT-4 Turbo achieved an overall accuracy of 72.09% (95% confidence interval [CI]: 69.3-74.7%). It performed best on Practice Management questions (odds ratio [OR] = 3.93, 95% CI: 1.12-13.73, p = 0.032) and declined on questions of moderate and hard difficulty (OR = 0.21, 95% CI: 0.16-0.29, p < 0.001 and OR = 0.04, 95% CI: 0.01-0.10, p < 0.001, respectively). In the comparative analysis, Grok-3 ranked highest with 76.3% accuracy (95% CI: 73.6-78.7%), followed by Claude-3.5-Sonnet (73.0%, 95% CI: 70.3-75.6%) and GPT-4o (69.9%, 95% CI: 67.1-72.5%); GPT-4 Turbo, accessed via its application programming interface, ranked fourth.

This comprehensive model comparison reveals that while major commercial LLMs show promising capabilities on specialized medical knowledge assessments, they plateau at roughly 73 to 76% accuracy. These findings suggest that current general-purpose LLMs may require specialized training approaches to advance beyond this performance threshold in medical domains.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13053123/pdf/
Citations: 0
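The McNemar pairwise comparisons reported above operate on paired right/wrong outcomes for two models answering the same questions. A minimal pure-Python sketch with continuity correction (the discordant counts are illustrative, not the study's data):

```python
import math

def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar's chi-square with continuity correction for paired binary
    outcomes: b = questions model A got right and model B got wrong,
    c = the reverse. Returns (statistic, p-value) with 1 degree of freedom."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > stat) = erfc(sqrt(stat/2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Illustrative discordant counts: model A alone correct on 25 questions,
# model B alone correct on 10.
stat, p = mcnemar(25, 10)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Concordant pairs (both right or both wrong) do not enter the statistic, which is why the test suits head-to-head model comparison on a shared question set.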
Evaluation of ChatGPT and Gemini in Answering Patient Questions after Gynecologic Surgery
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-03-01 (Epub: 2026-02-24) | DOI: 10.1055/a-2818-1611 | pp. 172-176
Petra C Voigt, Rhea S Sharma, Angela Chaudhari, Susan Tsai, Magdy P Milad, Linda C Yang

This study aimed to explore the performance of ChatGPT version 4.0 (GPT-4) and Gemini Advanced (Gemini) large language models (LLMs) in addressing common patient questions after gynecologic surgery with regard to accuracy, relevance, helpfulness to the average patient, and readability.

In this cross-sectional study, the two LLMs were prompted to generate answers to postoperative patient questions. Questions were developed to simulate common patient concerns after gynecologic surgery, based on expert opinion and compiled from anonymous posters on Reddit (r/endometriosis), and covered six topics: endometriosis, vaginal bleeding, bowel/bladder function, incision care, resumption of activities, and sexual function. Questions were submitted in a systematic three-step process with the model's memory reset after each query. Responses were blinded and independently rated for accuracy and relevance on a 5-point Likert scale by four board-certified gynecologic surgeons with fellowship training in gynecologic surgery; responses were also evaluated by three clinic nurses. Readability was calculated with the Flesch-Kincaid grade-level formula.

A total of 41 questions were posed to GPT-4 and Gemini three times each. The responses were independently evaluated by the four surgeons and three nurses, yielding 1,968 evaluations for accuracy, relevance, helpfulness, and readability. Raters graded Gemini responses as more accurate (4.23 vs. 4.03, p = 0.015) and more helpful (4.37 vs. 4.21, p = 0.025) than GPT-4 responses. Responses from both models were rated similarly relevant or very relevant (4.45 vs. 4.36, p = 0.2). Most responses by GPT-4 (85%) and Gemini (87%) were consistent across all questions. The average reading level was 11th grade for GPT-4 and 10th grade for Gemini, both above the recommended 6th-grade reading level for patient information.

GPT-4 and Gemini provided overall accurate, relevant, and helpful responses to common postoperative patient questions after gynecologic surgery. Gemini outperformed GPT-4 in both accuracy and helpfulness and produced objectively more readable responses.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13035415/pdf/
Citations: 0
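The Flesch-Kincaid grade level used above is a fixed linear formula over word, sentence, and syllable counts; a minimal sketch (the counts are illustrative, not taken from the study's responses):

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts, using the formula's
    standard coefficients (0.39, 11.8, 15.59)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Illustrative counts for one model response: 180 words across
# 9 sentences, 288 syllables in total.
grade = fk_grade(words=180, sentences=9, syllables=288)
print(f"grade level: {grade:.1f}")
```

Longer sentences and more syllables per word both push the grade upward, which is why model answers written in clinical register tend to land well above the recommended 6th-grade level.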
How DISCERNing is ChatGPT? An Evaluation of Models and Prompt Engineering in Assessing Patient Education Materials
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-03-01 (Epub: 2026-04-23) | DOI: 10.1055/a-2853-8892 | Vol. 17(2): 274-280
Justin Blackman, Danica Friesen, Markus Bernardus Sikkel, Richard Veerapen

Objectives: To evaluate whether ChatGPT models can reliably apply the DISCERN instrument, a 16-question human-scored rubric developed in 1999 to evaluate consumer health information, and to assess the impact of prompting strategy, model choice, and scoring repeatability on agreement with human-derived DISCERN scores.

Methods: A PubMed search for "DISCERN" identified English-language studies since 2019 reporting exact webpage URLs with corresponding human-derived DISCERN scores. Archived versions of 42 webpages were retrieved. Three ChatGPT models (GPT-5.2, GPT-4o, and o3) were evaluated using four prompting strategies: "Naïve" zero-shot, item-level "Split" scoring, "Augmented" prompting with DISCERN guidance, and a "Combined" split-plus-augmented approach. Agreement with human scores was assessed using correlations and absolute differences. Repeatability was examined over 10 repeated scoring runs across 9 webpages.

Results: Agreement between ChatGPT-generated and human DISCERN scores was weak to moderate. All models showed systematic score compression, overestimating low-quality webpages and underestimating high-quality ones. Combined prompting modestly improved agreement and reduced absolute error, particularly for the o3 model, which consistently outperformed GPT-5.2 and GPT-4o. Substantial run-to-run variability was observed, with a mean score range of 17.5 points and ranges up to 43 points for the same webpage. Averaging scores across runs did not improve agreement with human ratings. ChatGPT's DISCERN scoring reflects systematic attenuation consistent with prediction under noisy subjective measurement; prompt engineering did not correct the calibration bias or the reproducibility limitations.

Conclusion: Under the prompting strategies evaluated, ChatGPT models were insufficient for reliable automated DISCERN scoring. Persistent attenuation bias and poor repeatability substantially limit clinical and research applicability.

Citations: 0
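The run-to-run variability reported above (mean score range of 17.5 points) is the spread of repeated total scores per webpage, averaged across webpages; a small sketch with hypothetical scores (not the study's data):

```python
def score_ranges(runs_by_page: dict[str, list[int]]) -> dict[str, int]:
    """Per-webpage range (max - min) of repeated DISCERN total scores."""
    return {page: max(scores) - min(scores) for page, scores in runs_by_page.items()}

# Hypothetical totals from five repeated scoring runs on three webpages.
runs = {
    "page_a": [41, 44, 39, 52, 47],
    "page_b": [60, 61, 60, 63, 59],
    "page_c": [30, 45, 33, 51, 38],
}
ranges = score_ranges(runs)
mean_range = sum(ranges.values()) / len(ranges)
print(ranges, f"mean range = {mean_range:.1f}")
```

A large mean range relative to the instrument's total scale is what makes single-run LLM scores unreliable even when the average across runs looks plausible.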
From Pilot to Practice: A Sociotechnical Perspective for Sustainable Adoption of Patient Engagement Technologies
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-03-01 (Epub: 2026-04-21) | DOI: 10.1055/a-2843-8762 | Vol. 17(2): 237-245
Prashila Dullabh, Courtney Zott, Nicole Gauthreaux, Abigail Aronoff, Dean F Sittig, Aziz Boxwala

Background: Despite widespread investment in patient engagement technologies, such as mobile apps, chatbots, and remote monitoring tools, few have achieved sustained adoption or integration into clinical workflows. The persistent gap between pilot success and real-world scalability reflects not only technical barriers but also sociotechnical challenges involving people, processes, and policy.

Objectives: This study aimed to identify cross-cutting barriers and enablers of implementation across multiple real-world pilots of patient engagement technologies, and to extend the Sociotechnical Model (STM) of Health Information Technology (IT) to explicitly incorporate patient perspectives and lived experiences as determinants of adoption and sustainability.

Methods: Drawing on our team's formal evaluations of patient engagement technology implementations across four U.S. health systems, including applications for coronavirus disease 2019, hypertension, medication adherence, and chatbot-supported communication, we synthesized lessons learned across eight STM domains: hardware/software, clinical content, human-computer interface, people, workflow and communication, organizational policies, external pressures, and system monitoring.

Results: Eight cross-cutting lessons emerged: (1) effective leadership and collaboration across clinicians, IT and informatics teams, patients, and electronic health record and app developers are essential; (2) uneven standards adoption and support continue to limit interoperability; (3) success depends on skilled technical resources with expertise in interoperability standards; (4) patients should be engaged in codesign early and throughout; (5) sustained patient engagement requires structured onboarding and feedback loops; (6) design must account for diverse patient needs and preferences; (7) clinician workflows must be redesigned to integrate and act on patient-contributed data without increasing burden; and (8) demonstrated return on investment is needed to justify long-term maintenance costs.

Conclusion: Sustaining patient engagement technologies requires expanding the sociotechnical lens to include patients' lifeflows alongside organizational and technical factors. Future implementation, research, and policy efforts should focus on collaborative leadership models, patient-centered engagement processes, enhanced interoperability, clear data monitoring workflows, and financial sustainability.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13099257/pdf/
Citations: 0
Erratum on Pagination Correction: Industry Electives in Clinical Informatics Fellowship: Early Experiences from a Multi-Institution Survey
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-03-01 (Epub: 2026-05-06) | DOI: 10.1055/a-2843-2793 | Vol. 17(2): e1
Nicholas Genes, Priyanka Solanki, Joseph Kannry, Raman Khanna, Dara Mize, Veena Lingam, Robert W Turer, Michael G Leu

Erratum; no abstract.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13148862/pdf/
Citations: 0
The Clinical Utility of Traditional and Machine Learning Alarms during the Care of Acutely Ill Patients
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-01-01 (Epub: 2026-03-03) | DOI: 10.1055/a-2815-1912 | Vol. 17(1): 107-117
Nicole Rosario, Henry M Mitchell, Sylvia Zhang, Nandakumar Selvaraj, Xiaozhu Zhang, Carme Hernandez, Stuart R Lipsitz, David M Levine

Despite low-level evidence, acutely ill patients are often continuously monitored, creating high false alarm rates and alarm fatigue with unclear clinical effectiveness. We compared metrics, including alarm burden, area under the receiver operating characteristic curve (auROC), sensitivity, and specificity, for threshold alarms, score alarms (the National Early Warning Score [NEWS]), and machine learning (ML) alarms.

We retrospectively annotated continuous biometric data from acutely ill patients receiving hospital care at home for clinical utility (a change in clinical management) or a safety composite, using the electronic health record. Threshold alarms for heart rate (HR), respiratory rate (RR), and falls were set pragmatically by clinical teams; the score alarm was the NEWS; and the ML alarm was an unsupervised algorithm that detected anomalies in HR, RR, and activity. The primary outcome was alarm burden (alarms/patient-hour); secondary outcomes included alarm performance.

We studied 526 patients (median age 71, interquartile range [IQR]: 25; 60.3% female; 45.1% White). Compared with threshold alarms (0.132 alarms/patient-hour), alarm burden was lower with score and ML alarms (0.005 and 0.032 alarms/patient-hour, respectively; p < 0.001 for both). The positive predictive value for identifying clinical utility was 0.073 for threshold, 0.247 for score, and 0.181 for ML alarms. The auROC for identifying the safety composite was 0.557 for threshold, 0.578 for score, and 0.656 for ML alarms.

Score and ML alarms decreased alarm burden with higher overall performance in recognizing clinically important events. These findings suggest that score or ML alarms hold promise for reducing alarm fatigue while improving recognition of clinically important events, although all alarm types require improvement.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12956398/pdf/
Citations: 0
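The primary outcome (alarm burden) and the positive predictive value above reduce to simple ratios; a sketch with counts chosen only to illustrate, roughly reproducing the reported threshold-alarm burden (not the study's actual counts):

```python
def alarm_burden(n_alarms: int, patient_hours: float) -> float:
    """Alarms per patient-hour: the study's primary outcome measure."""
    return n_alarms / patient_hours

def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: the fraction of alarms that had
    clinical utility (i.e., led to a change in management)."""
    return true_pos / (true_pos + false_pos)

# Illustrative: 660 threshold alarms over 5,000 monitored patient-hours,
# of which 48 led to a change in clinical management.
burden = alarm_burden(660, 5000)
precision = ppv(48, 660 - 48)
print(f"burden = {burden:.3f} alarms/patient-hour, PPV = {precision:.3f}")
```

Framing both quantities as ratios makes the trade-off explicit: a lower burden with a higher PPV means fewer interruptions per hour and a larger share of them actionable.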
Corrigendum: Measurement Properties of Instruments Assessing Digital Competence in Nursing: A Systematic Review
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-01-01 (Epub: 2026-03-19) | DOI: 10.1055/a-2815-8240 | Vol. 17(1): e1-e3
Fabio D'Agostino, Ilaria Erba, Elske Ammenwerth, Vered Robinzon, Gad Segal, Nissim Harel, Elisabetta Corvo, Refael Barkan, Hadas Lewy, Noemi Giannetta

Corrigendum; no abstract.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13002285/pdf/
Citations: 0
Optimizing HIV Care Engagement: Usability of a mHealth App for Identifying and Retaining Individuals with Nonviral Suppression in Digital Cohort
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-01-01 (Epub: 2026-01-30) | DOI: 10.1055/a-2786-0291 | Vol. 17(1): 39-45
Fabiana Cristina Dos Santos, Sophia McInerney, Miya C Tate, Aadia Rana, D Scott Batey, Rebecca Schnall

Drive to Zero is a mobile health application (app) designed to identify and retain people with HIV (PWH) who have experienced challenges achieving or maintaining viral suppression. The app targets PWH who have lacked documented HIV care in recent months and face medication adherence barriers. Features include an interactive chat for communicating with the study team and educational resources that support care engagement and health management.

This usability study assessed the Drive to Zero app's ease of use and interface design through expert heuristic evaluation and end-user testing. Usability was evaluated in two ways: heuristic evaluations by five informatics experts following Nielsen's usability principles, and end-user testing with 20 PWH using the validated Post-Study System Usability Questionnaire along with qualitative interviews collecting feedback on app functionality and user experience.

Heuristic experts and end-users were satisfied with the app's appearance, describing a simple, intuitive interface for identifying and retaining PWH that should support study engagement and, ultimately, reengagement with HIV care. However, participants highlighted areas needing improvement, suggesting more accessible "home" and "help" buttons to improve user control and a more detailed explanation of the incentive program to enhance engagement and retention.

The usability evaluations provided valuable insights into the Drive to Zero app's design. Key areas for improvement were user controls and the readability of the incentive program description. These findings will guide iterative refinements so that future versions of the app improve usability and acceptability for its target audience.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12858313/pdf/
Citations: 0
Standardizing Data Elements for Implementation of ICU Liberation Bundle
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-01-01 (Epub: 2026-02-03) | DOI: 10.1055/a-2802-7458 | pp. 52-59
Md Fantacher Islam, Molly Douglas, Jarrod Mosier, Vignesh Subbian

Getting patients out of intensive care units (ICUs) is a major goal for acute care clinicians, as prolonged stays increase the risk of complications and strain critical resources such as staff, equipment, and beds. The ICU Liberation Bundle, or ABCDEF (A-F) care bundle, is an evidence-based framework for improving outcomes in critically ill patients by addressing pain, sedation, delirium, mobility, and family engagement. However, variability in documentation and a lack of standardized data elements hinder effective implementation and evaluation of adherence to bundle components.

This study aims to characterize data elements of the A-F bundle using a large, single-center critical care database and to develop standardized bundle cards that map bundle components to controlled vocabularies.

We conducted a retrospective analysis of A-F bundle data elements using the MIMIC-IV database. Clinical concepts were mapped to standardized vocabularies and aligned with the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). Bundle cards were developed for each component to provide structured, accessible documentation of assessment tools, adherence criteria, and terminology mappings.

Pain assessments were documented for over 11,000 patients, with a median of 23 assessments per day. Sedation levels were evaluated for nearly 59,000 patients, with 37.7% meeting the Society of Critical Care Medicine (SCCM) adherence criteria. Delirium assessments followed standardized protocols incorporating Richmond Agitation-Sedation Scale (RASS) and CAM-ICU scores. Components E and F lacked formal compliance specifications; bundle cards for these components identified key activities and highlighted gaps in standardized vocabularies. Adherence analyses revealed variability likely due to non-standardized documentation practices.

We developed and validated six ICU Liberation Bundle cards that map bundle components to standardized vocabularies and CDMs, enabling retrospective adherence evaluation in real-world data. These resources promote consistent documentation, support interoperability, and provide a foundation for prospective monitoring to enhance bundle implementation in critical care.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12900566/pdf/
Citations: 0
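A bundle card of the kind described might be represented as a small structured record mapping a component to its tools, adherence rule, and vocabulary codes. This is an illustrative sketch only: the component wording follows SCCM's published A element, but the tools, criterion, and vocabulary entries here are placeholders, not the paper's actual mappings:

```python
from dataclasses import dataclass, field

@dataclass
class BundleCard:
    """One card per A-F component, documenting assessment tools,
    an adherence criterion, and mappings to standard vocabularies."""
    component: str
    assessment_tools: list[str]
    adherence_criterion: str
    vocab_mappings: dict[str, str] = field(default_factory=dict)

# Hypothetical card for component A; vocabulary codes are placeholders.
cards = {
    "A": BundleCard(
        component="Assess, prevent, and manage pain",
        assessment_tools=["Numeric Rating Scale", "CPOT"],
        adherence_criterion=">= 6 documented pain assessments per day",
        vocab_mappings={"NRS pain score": "LOINC:<placeholder>"},
    ),
}
print(cards["A"].component)
```

Keeping the adherence criterion and terminology mappings on the same record is what allows the same card to drive both retrospective adherence queries against an OMOP CDM and prospective documentation checks.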
Measuring the Accuracy and Reproducibility of DeepSeek R1, Claude 3.5 Sonnet, and GPT-4.1 on Complex Clinical Scenarios
IF 2.2 | CAS Q2 (Medicine)
Applied Clinical Informatics | Pub Date: 2026-01-01 (Epub: 2026-02-09) | DOI: 10.1055/a-2807-4256 | pp. 64-72
Robert E Hoyt, Maria Bajwa

The integration of large language models (LLMs) into clinical diagnostics presents significant challenges regarding their accuracy and reliability. This study aimed to evaluate the performance of DeepSeek R1, an open-source reasoning model, alongside two other LLMs, GPT-4.1 and Claude 3.5 Sonnet, on multiple-choice clinical cases.

A dataset of complex medical cases representative of real-world clinical practice was selected. For efficiency, models were accessed via application programming interfaces (APIs) and assessed using standardized prompts and a predefined evaluation protocol.

The models demonstrated an overall accuracy of 77.1%, with GPT-4.1 producing the fewest errors and Claude 3.5 Sonnet the most. The reproducibility analysis indicated that the tests were highly repeatable: DeepSeek R1 (100%), GPT-4.1 (97.5%), and Claude 3.5 Sonnet (92%).

While LLMs show promise for enhancing diagnostics, ongoing scrutiny is required to address error rates and validate standard medical answers. Given the limited dataset and prompting protocol, these findings should not be interpreted as broader equivalence in real-world clinical reasoning. The study demonstrates the need for robust evaluation standards, attention to error rates, and further research.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12923312/pdf/
Citations: 0
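The reproducibility percentages above can be computed as the fraction of cases whose repeated runs all return the same answer; a minimal sketch with hypothetical multiple-choice answers (not the study's data):

```python
def reproducibility(answers_by_case: list[list[str]]) -> float:
    """Fraction of cases where every repeated run gave the same answer."""
    consistent = sum(1 for runs in answers_by_case if len(set(runs)) == 1)
    return consistent / len(answers_by_case)

# Hypothetical: five cases, each answered three times by the same model.
cases = [
    ["B", "B", "B"],
    ["A", "A", "A"],
    ["C", "C", "D"],  # inconsistent across runs
    ["A", "A", "A"],
    ["D", "D", "D"],
]
rate = reproducibility(cases)
print(f"reproducibility = {rate:.0%}")
```

Note this measures agreement across runs, not correctness: a model can be 100% reproducible while consistently giving a wrong answer, which is why the study reports accuracy and reproducibility separately.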