Open-source convolutional neural network to classify distal radial fractures according to the AO/OTA classification on plain radiographs.

European journal of trauma and emergency surgery : official publication of the European Trauma Society Pub Date : 2025-07-21 DOI:10.1007/s00068-025-02931-6

Koen D Oude Nijhuis, Jasper Prijs, Britt Barvelink, Hans van Luit, Yang Zhao, Zhibin Liao, Ruurd L Jaarsma, Frank F A IJpma, Mathieu M E Wijffels, Job N Doornberg, Joost W Colaris

Vol. 51(1), p. 261. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12279608/pdf/


Abstract

Purpose: Convolutional neural networks (CNNs) have shown promise in fracture detection, but their ability to improve surgeons' inconsistent fracture classification remains unstudied. Therefore, our aim was to create and (externally) validate an open-source CNN algorithm that classifies distal radial fractures (DRFs) according to the AO/OTA classification system.

Methods: Patients with postero-anterior, lateral and oblique radiographs were included. Radiographs were classified according to the AO/OTA classification and used to train a CNN algorithm. The algorithm was tested on an internal validation set and on an external validation set (from two other level 1 trauma centers), with the DRFs classified by three independent surgeons.

Results: In total, 659 radiographs were used to train the algorithm. The internal and external validation sets contained 190 and 188 patients, respectively. On internal validation, the CNN had an accuracy of 62% and an area under the receiver operating characteristic curve (AUC) of 0.63-0.93 (type 2R3A 0.84, type 2R3B 0.63, type 2R3C 0.75, and no DRF 0.93). On external validation, the algorithm had an accuracy of 61% and an AUC of 0.56-0.88 (type 2R3A 0.82, type 2R3B 0.56, type 2R3C 0.75, and no DRF 0.88).
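
The abstract does not state how the per-class AUCs were computed. A common convention for a four-class problem like this one is one-vs-rest: each class in turn is treated as positive and all others as negative. The sketch below illustrates that convention alongside overall accuracy. The function names and the toy predictions are hypothetical and are not taken from the study or its open-source code.

```python
from typing import Dict, List, Sequence

CLASSES = ["2R3A", "2R3B", "2R3C", "no DRF"]


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted class matches the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str],
    probs: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> Dict[str, float]:
    """One AUC per class: that class is positive, every other class negative."""
    result = {}
    for k, cls in enumerate(classes):
        labels = [1 if t == cls else 0 for t in y_true]
        scores = [row[k] for row in probs]
        result[cls] = binary_auc(labels, scores)
    return result


# Made-up toy data (NOT study data): reference labels and predicted
# class probabilities in the order given by CLASSES.
y_true: List[str] = ["2R3A", "2R3B", "2R3C", "no DRF", "2R3A", "2R3B"]
probs: List[List[float]] = [
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.3, 0.5, 0.1],
    [0.0, 0.1, 0.1, 0.8],
    [0.5, 0.2, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
]

# Predicted class = class with the highest probability in each row.
y_pred = [CLASSES[max(range(len(CLASSES)), key=row.__getitem__)] for row in probs]

acc = accuracy(y_true, y_pred)
aucs = one_vs_rest_auc(y_true, probs, CLASSES)

if __name__ == "__main__":
    print(f"accuracy: {acc:.2f}")
    for cls, auc in aucs.items():
        print(f"AUC {cls}: {auc:.2f}")
```

Under this convention a class can have a high AUC even when overall accuracy is modest, because AUC measures how well the scores for one class rank cases, while accuracy counts only the single top prediction per case.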

Conclusion: The presented algorithm demonstrated excellent accuracy in classifying type 2R3A DRFs and in excluding DRFs. However, its accuracy in classifying 2R3B and 2R3C DRFs according to the AO/OTA system was poor to moderate, similar to the limited inter-observer agreement among surgeons. These results show that, despite their previously demonstrated excellence in fracture detection, CNN algorithms struggle with classification, which potentially reflects inherent problems with these classification systems.

