Can GPT-4o Accurately Diagnose Trauma X-Rays? A Comparative Study with Expert Evaluations

Ahmet Öztürk MD, Serkan Günay MD, Serdal Ateş MD, Yavuz Yiğit MD

Journal of Emergency Medicine, Volume 73, Pages 71-79 (June 2025)
DOI: 10.1016/j.jemermed.2024.12.010
Citations: 0
Abstract
Background
The latest artificial intelligence (AI) model, GPT-4o, introduced by OpenAI, can process visual data, presenting a novel opportunity for radiographic evaluation in trauma patients.
Objective
This study aimed to assess the efficacy of GPT-4o in interpreting radiographs for traumatic bone pathologies and to compare its performance with that of emergency medicine and orthopedic specialists.
Methods
The study involved 10 emergency medicine specialists, 10 orthopedic specialists, and the GPT-4o AI model, evaluating 25 cases of traumatic bone pathologies of the upper and lower extremities selected from the Radiopaedia website. Participants were asked to identify fractures or dislocations in the radiographs within 45 minutes. GPT-4o was instructed to perform the same task in 10 different chat sessions.
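The paper does not publish its exact prompting setup. As a minimal sketch, each of the 10 independent chat sessions could be reproduced by sending one request per case to OpenAI's chat-completions endpoint, pairing a text instruction with the radiograph as an inline base64 image; the function name, prompt wording, and placeholder image bytes below are illustrative assumptions, not the authors' protocol.

```python
import base64
import json

def build_gpt4o_request(prompt: str, image_bytes: bytes) -> dict:
    """Build a chat-completions payload pairing a text prompt with one
    radiograph, encoded inline as a base64 data URL (illustrative sketch)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
    }

# One request per case; using a fresh request each time means no
# conversational context carries over between cases or sessions.
payload = build_gpt4o_request(
    "Identify any fracture or dislocation in this radiograph.",
    b"\x89PNG",  # placeholder bytes standing in for a real X-ray file
)
print(json.dumps(payload)[:40])
```

Posting this payload to the API (authentication omitted here) would yield one diagnostic response per radiograph, which can then be scored against the reference diagnosis.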
Results
Emergency medicine specialists and orthopedic specialists demonstrated an average accuracy of 82.8% and 87.2%, respectively, in radiograph interpretation. In contrast, GPT-4o achieved an accuracy of only 11.2%. Statistical analysis revealed significant differences among the three groups (p < 0.001), with GPT-4o performing significantly worse than both groups of specialists.
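Each group contributed 10 readers (or sessions) × 25 cases = 250 reads, so the reported percentages correspond to simple correct/total proportions. The raw counts below are back-calculated from those percentages for illustration; they are not taken from the paper's raw data.

```python
def accuracy(correct: int, total: int) -> float:
    """Proportion of radiographs read correctly."""
    return correct / total

# Counts back-calculated from the reported accuracies
# (10 readers/sessions x 25 cases = 250 reads per group):
groups = {
    "emergency medicine": accuracy(207, 250),  # 82.8%
    "orthopedics": accuracy(218, 250),         # 87.2%
    "GPT-4o": accuracy(28, 250),               # 11.2%
}
for name, acc in groups.items():
    print(f"{name}: {acc:.1%}")
```

A between-group comparison of such correct/incorrect counts would typically use a chi-square test, which is consistent with the reported p < 0.001.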
Conclusion
GPT-4o's ability to interpret radiographs of traumatic bone pathologies is currently limited and significantly inferior to that of trained specialists. These findings underscore the ongoing need for human expertise in trauma diagnosis and highlight the challenges of applying AI to complex medical imaging tasks.
Journal Introduction:
The Journal of Emergency Medicine is an international, peer-reviewed publication featuring original contributions of interest to both the academic and practicing emergency physician. JEM, published monthly, contains research papers and clinical studies as well as articles focusing on the training of emergency physicians and on the practice of emergency medicine. The Journal features the following sections:
• Original Contributions
• Clinical Communications: Pediatric, Adult, OB/GYN
• Selected Topics: Toxicology, Prehospital Care, The Difficult Airway, Aeromedical Emergencies, Disaster Medicine, Cardiology Commentary, Emergency Radiology, Critical Care, Sports Medicine, Wound Care
• Techniques and Procedures
• Technical Tips
• Clinical Laboratory in Emergency Medicine
• Pharmacology in Emergency Medicine
• Case Presentations of the Harvard Emergency Medicine Residency
• Visual Diagnosis in Emergency Medicine
• Medical Classics
• Emergency Forum
• Editorial(s)
• Letters to the Editor
• Education
• Administration of Emergency Medicine
• International Emergency Medicine
• Computers in Emergency Medicine
• Violence: Recognition, Management, and Prevention
• Ethics
• Humanities and Medicine
• American Academy of Emergency Medicine
• AAEM Medical Student Forum
• Book and Other Media Reviews
• Calendar of Events
• Abstracts
• Trauma Reports
• Ultrasound in Emergency Medicine