Assessing the accuracy of ChatGPT in interpreting blood gas analysis results ChatGPT-4 in blood gas analysis

IF 5.0 · Zone 2, Medicine · JCR Q1, Anesthesiology

Journal of Clinical Anesthesia · Pub Date: 2025-02-21 · DOI: 10.1016/j.jclinane.2025.111787

Engin İhsan Turan, Abdurrahman Engin Baydemir, Anıl Berkay Balıtatlı, Ayça Sultan Şahin

Volume 102, Article 111787. Publication type: Journal Article. Not open access. Source: https://www.sciencedirect.com/science/article/pii/S0952818025000479

Citations: 0

Abstract

Background

Arterial blood gas (ABG) analysis is a critical component of patient management in intensive care units (ICUs), operating rooms, and general wards, providing essential information on acid-base balance, oxygenation, and metabolic status. Interpretation requires a high level of expertise, potentially leading to variability in accuracy. This study explores the feasibility and accuracy of ChatGPT-4, an AI-based model, in interpreting ABG results compared to experienced anesthesiologists.

Methods

This prospective observational study, approved by the institutional ethics board, included 400 ABG samples from ICU patients, anonymized and assessed by ChatGPT-4. The model analyzed parameters including acid-base status, oxygenation, hemoglobin levels, and metabolic markers, and provided both diagnostic and treatment recommendations. Two anesthesiologists, trained in ABG interpretation, independently evaluated the model's predictions to determine accuracy in potential diagnoses and treatment.

Results

ChatGPT-4 achieved high accuracy across most ABG parameters, with 100% accuracy for pH, oxygenation, sodium, and chloride. Hemoglobin accuracy was 92.5%, while bilirubin interpretation showed limitations at 72.5%. In several cases, the model recommended unnecessary bicarbonate treatment, suggesting an area for improvement in clinical judgment for acid-base balance management. The model's overall performance was statistically significant across most parameters (p < 0.05).
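To make the reported percentages concrete, the following minimal sketch converts them into implied counts of correct interpretations and attaches a Wilson 95% interval to each. It assumes, hypothetically, that every parameter was scored on all 400 samples; the abstract does not state the per-parameter denominator. The interval is an illustration of sampling uncertainty, not an analysis reported by the study, and the function names are made up for this example.

```python
from math import sqrt

# Sample count from the abstract. Using it as the denominator for every
# parameter is an ASSUMPTION; the abstract does not give per-parameter counts.
N_SAMPLES = 400

# Accuracy percentages as reported in the abstract.
REPORTED_ACCURACY = {
    "pH": 100.0,
    "oxygenation": 100.0,
    "sodium": 100.0,
    "chloride": 100.0,
    "hemoglobin": 92.5,
    "bilirubin": 72.5,
}


def implied_correct(pct: float, n: int = N_SAMPLES) -> int:
    """Number of correct interpretations implied by a percentage of n."""
    return round(pct / 100 * n)


def wilson_interval(correct: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for name, pct in REPORTED_ACCURACY.items():
        k = implied_correct(pct)
        lo, hi = wilson_interval(k, N_SAMPLES)
        print(f"{name:12s} {k:3d}/{N_SAMPLES}  95% CI: {lo:.3f} to {hi:.3f}")
```

Under this assumption the bilirubin figure would correspond to 290 of 400 interpretations judged correct and the hemoglobin figure to 370 of 400; a different denominator would change both the counts and the interval widths.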

Discussion

ChatGPT-4 demonstrated potential as a supplementary tool for ABG interpretation in high-demand clinical settings, supporting rapid, reliable decision-making. However, the model's limitations in interpreting complex metabolic markers highlight the need for clinician oversight. Future refinements should focus on enhancing AI training for nuanced metabolic interpretation, particularly for markers like bilirubin, to ensure safe and effective application across diverse clinical contexts.




Source journal: Journal of Clinical Anesthesia (Medicine: Anesthesiology)

CiteScore: 7.40

Self-citation rate: 4.50%

Articles published: 346

Review time: 23 days

About the journal: The Journal of Clinical Anesthesia (JCA) addresses all aspects of anesthesia practice, including anesthetic administration, pharmacokinetics, preoperative and postoperative considerations, coexisting disease and other complicating factors, cost issues, and similar concerns anesthesiologists contend with daily. Exceptionally high standards of presentation and accuracy are maintained. The core of the journal is rigorously peer-reviewed original contributions on subjects relevant to clinical practice. Highly respected international experts have joined together to form the Editorial Board, sharing their years of experience and clinical expertise. Specialized section editors cover the various subspecialties within the field. To keep your practical clinical skills current, the journal bridges the gap between the laboratory and the clinical practice of anesthesiology and critical care to clarify how new insights can improve daily practice.