Comparison between artificial intelligence solution and radiologist for the detection of pelvic, hip and extremity fractures on radiographs in adult using CT as standard of reference

IF 8.1, Zone 2 (Medicine), Q1, Radiology, Nuclear Medicine & Medical Imaging

Diagnostic and Interventional Imaging Pub Date : 2025-01-01 DOI:10.1016/j.diii.2024.09.004

Maxime Pastor, Djamel Dabli, Raphaël Lonjon, Chris Serrand, Fehmi Snene, Fayssal Trad, Fabien de Oliveira, Jean-Paul Beregi, Joël Greffier

Diagnostic and Interventional Imaging, Volume 106, Issue 1, Pages 22-27. Journal Article. https://www.sciencedirect.com/science/article/pii/S2211568424001979

Citations: 0

Abstract

Purpose

The purpose of this study was to compare the diagnostic performance of an artificial intelligence (AI) solution for the detection of pelvic, proximal femur, or extremity fractures on radiographs in adults with that of radiologist interpretation, using standard-dose CT examination as the standard of reference.

Materials and methods

This retrospective study included 94 adult patients with suspected bone fractures who underwent a standard-dose CT examination and radiographs of the pelvis and/or hip and extremities at our institution between January 2022 and August 2023. For all patients, an AI solution was applied retrospectively to the radiographs to detect and localize bone fractures of the pelvis and/or hip and extremities. The results of the AI solution were compared with a radiologist's reading of each radiograph using the McNemar test. The results of the standard-dose CT examination, as interpreted by a senior radiologist, were used as the standard of reference.
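The McNemar test mentioned above compares two readers on the same cases by looking only at the discordant pairs, that is, the cases one reader classified correctly and the other did not. The following is a minimal sketch of the exact (binomial) form of the test. The abstract does not report the discordant counts or say which variant of the test was used, so the function name `mcnemar_exact` and the counts in the usage line are hypothetical and serve only as illustration.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases reader A classified correctly and reader B did not
    c: cases reader B classified correctly and reader A did not
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, NOT taken from the study.
p_value = mcnemar_exact(b=2, c=9)
print(f"exact McNemar p = {p_value:.3f}")
```

Concordant pairs (both readers right, or both wrong) do not enter the statistic, which is why two readers with similar overall accuracy can still differ significantly when their errors fall on different cases.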

Results

A total of 94 patients (63 women; mean age, 56.4 ± 22.5 [standard deviation] years) were included. Forty-seven patients had at least one fracture, and a total of 71 fractures were deemed present using the standard of reference (25 hand/wrist, 16 pelvis, 30 foot/ankle). Using the standard of reference, the analysis of radiographs by the AI solution resulted in 58 true positive, 13 false negative, 33 true negative and 15 false positive findings, yielding 82 % sensitivity (58/71; 95 % confidence interval [CI]: 71–89 %), 69 % specificity (33/48; 95 % CI: 55–80 %), and 76 % accuracy (91/119; 95 % CI: 69–84 %). Using the standard of reference, the reading of the radiologist resulted in 65 true positive, 6 false negative, 42 true negative and 6 false positive findings, yielding 92 % sensitivity (65/71; 95 % CI: 82–96 %), 88 % specificity (42/48; 95 % CI: 75–94 %), and 90 % accuracy (107/119; 95 % CI: 85–95 %). The radiologist outperformed the AI solution in terms of sensitivity (P = 0.045), specificity (P = 0.016), and accuracy (P < 0.001).
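The percentages above follow directly from the four confusion-matrix counts reported for each reader. The sketch below recomputes sensitivity, specificity, and accuracy from those counts and attaches a 95 % Wilson score interval to each. The abstract does not state how the confidence intervals were computed, so the Wilson method is an assumption and its bounds may differ by a percentage point from the reported ones; the function names are illustrative.

```python
from math import sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95 % interval


def wilson_interval(successes: int, total: int, z: float = Z95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (numerator, denominator) for each metric."""
    return {
        "sensitivity": (tp, tp + fn),            # fractures correctly detected
        "specificity": (tn, tn + fp),            # non-fractures correctly cleared
        "accuracy": (tp + tn, tp + fn + tn + fp),
    }


# Counts as reported in the abstract.
readers = {
    "AI solution": dict(tp=58, fn=13, tn=33, fp=15),
    "Radiologist": dict(tp=65, fn=6, tn=42, fp=6),
}

for name, counts in readers.items():
    for metric, (num, den) in diagnostic_metrics(**counts).items():
        low, high = wilson_interval(num, den)
        print(f"{name:12s} {metric:12s} {num}/{den} = {num / den:.1%} "
              f"(95% CI {low:.1%} to {high:.1%})")
```

Returning the numerator and denominator rather than a bare ratio keeps the fractions (for example 58/71) visible, which makes the output easy to check against the figures quoted in the text.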

Conclusion

In this study, the radiologist outperformed the AI solution for the diagnosis of pelvic, hip and extremity fractures using radiographs. This raises the question of whether a strong standard of reference for evaluating AI solutions should be used in future studies comparing AI and human reading in fracture detection using radiographs.

Source journal

Diagnostic and Interventional Imaging (Medicine: Radiology, Nuclear Medicine and Imaging)

CiteScore: 8.50

Self-citation rate: 29.10%

Articles published: 126

Review time: 11 days

About the journal: Diagnostic and Interventional Imaging accepts publications originating from any part of the world based only on their scientific merit. The Journal focuses on well-illustrated articles and aims to help sharpen clinical decision-making skills as well as to follow leading research topics. All articles are published in English. Diagnostic and Interventional Imaging publishes editorials, technical notes, letters, original and review articles on abdominal, breast, cancer, cardiac, emergency, forensic medicine, head and neck, musculoskeletal, gastrointestinal, genitourinary, interventional, obstetric, pediatric, thoracic and vascular imaging, neuroradiology, nuclear medicine, as well as contrast material, computer developments, health policies and practice, and medical physics relevant to imaging.