Detection, classification, and characterization of proximal humerus fractures on plain radiographs.

IF 4.9, Zone 1 (Medicine), Q1 ORTHOPEDICS

Bone & Joint Journal Pub Date : 2024-11-01 DOI:10.1302/0301-620X.106B11.BJJ-2024-0264.R1

Reinier W A Spek, William J Smith, Marat Sverdlov, Sebastiaan Broos, Yang Zhao, Zhibin Liao, Johan W Verjans, Jasper Prijs, Minh-Son To, Henrik Åberg, Wael Chiri, Frank F A IJpma, Bhavin Jadav, John White, Gregory I Bain, Paul C Jutte, Michel P J van den Bekerom, Ruurd L Jaarsma, Job N Doornberg, Soheil Ashkani, Nick Assink, Joost W Colaris, Nynke V der Gaast, Prakash Jayakumar, Laura J Kim, Huub H de Klerk, Joost Kuipers, Wouter H Mallee, Anne M L Meesters, Stijn R J Mennes, Miriam G E Oldhof, Peter A J Pijpker, Ching Yiu Lau, Mathieu M E Wijffels, Arno D Wolf

Volume 106-B, issue 11, pages 1348-1360. Publication type: Journal Article.

Citations: 0

Abstract

Aims: The purpose of this study was to develop a convolutional neural network (CNN) for fracture detection, classification, and identification of greater tuberosity displacement ≥ 1 cm, neck-shaft angle (NSA) ≤ 100°, shaft translation, and articular fracture involvement, on plain radiographs.

Methods: The CNN was trained and tested on radiographs sourced from 11 hospitals in Australia and externally validated on radiographs from the Netherlands. Each radiograph was paired with corresponding CT scans to serve as the reference standard based on dual independent evaluation by trained researchers and attending orthopaedic surgeons. Presence of a fracture, classification (non- to minimally displaced; two-part, multipart, and glenohumeral dislocation), and four characteristics were determined on 2D and 3D CT scans and subsequently allocated to each series of radiographs. Fracture characteristics included greater tuberosity displacement ≥ 1 cm, NSA ≤ 100°, shaft translation (0% to < 75%, 75% to 95%, > 95%), and the extent of articular involvement (0% to < 15%, 15% to 35%, or > 35%).
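The thresholds listed in the Methods can be written out as a small labelling routine. The following is only an illustrative sketch, not the authors' pipeline: the `CtMeasurement` class, its field names, the `bin_three` helper, and the treatment of exact boundary values (75%, 95%, 15%, 35%) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CtMeasurement:
    """Hypothetical container for CT-derived reference measurements."""
    gt_displacement_mm: float        # greater tuberosity displacement
    neck_shaft_angle_deg: float      # neck-shaft angle (NSA)
    shaft_translation_pct: float     # shaft translation, in percent
    articular_involvement_pct: float # articular involvement, in percent


def bin_three(value_pct: float, lower: int, upper: int) -> str:
    """Assign a percentage to one of three bins: <lower, lower..upper, >upper."""
    if value_pct < 0:
        raise ValueError("percentage must be non-negative")
    if value_pct < lower:
        return f"0% to <{lower}%"
    if value_pct <= upper:
        return f"{lower}% to {upper}%"
    return f">{upper}%"


def label(m: CtMeasurement) -> dict:
    """Turn raw measurements into the four characteristics named in the abstract."""
    return {
        "gt_displacement_ge_1cm": m.gt_displacement_mm >= 10.0,
        "nsa_le_100deg": m.neck_shaft_angle_deg <= 100.0,
        "shaft_translation": bin_three(m.shaft_translation_pct, 75, 95),
        "articular_involvement": bin_three(m.articular_involvement_pct, 15, 35),
    }


example = label(CtMeasurement(12.0, 95.0, 80.0, 10.0))
```

In the study these labels were determined on CT and then allocated to the matching radiograph series, so a routine like this would run on the reference measurements rather than on the radiographs themselves.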

Results: For detection and classification, the algorithm was trained on 1,709 radiographs (n = 803), tested on 567 radiographs (n = 244), and subsequently externally validated on 535 radiographs (n = 227). For characterization, healthy shoulders and glenohumeral dislocation were excluded. The overall accuracy for fracture detection was 94% (area under the receiver operating characteristic curve (AUC) = 0.98) and for classification 78% (AUC 0.68 to 0.93). Accuracy to detect greater tuberosity fracture displacement ≥ 1 cm was 35.0% (AUC 0.57). The CNN did not recognize NSAs ≤ 100° (AUC 0.42), nor fractures with ≥ 75% shaft translation (AUC 0.51 to 0.53), or with ≥ 15% articular involvement (AUC 0.48 to 0.49). For all objectives, the model's performance on the external dataset showed similar accuracy levels.
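The Results report both accuracy and AUC, which measure different things: accuracy depends on a chosen decision threshold, whereas AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, so a value near 0.5 indicates chance-level discrimination. The sketch below computes both on made-up toy data. The function names, the 0.5 threshold, and the example scores are assumptions for illustration and are not taken from the study.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases whose thresholded score matches the reference label."""
    preds = [int(s >= threshold) for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data only: 1 = characteristic present on CT, 0 = absent.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

toy_accuracy = accuracy(toy_labels, toy_scores)  # 4 of 6 cases correct
toy_auc = roc_auc(toy_labels, toy_scores)        # 8 of 9 pairs ranked correctly
```

Read this way, the reported AUCs of 0.42 to 0.53 for NSA, shaft translation, and articular involvement are close to what a model that assigns uninformative scores would produce.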

Conclusion: CNNs proficiently rule out proximal humerus fractures on plain radiographs. Despite rigorous training methodology based on CT imaging with multi-rater consensus to serve as the reference standard, artificial intelligence-driven classification is insufficient for clinical implementation. The CNN exhibited poor diagnostic ability to detect greater tuberosity displacement ≥ 1 cm and failed to identify NSAs ≤ 100°, shaft translations, or articular fractures.

Source journal

Bone & Joint Journal ORTHOPEDICS-SURGERY

CiteScore: 9.40

Self-citation rate: 10.90%

Articles published: 318

About the journal: We welcome original articles from any part of the world. The papers are assessed by members of the Editorial Board and our international panel of expert reviewers, then either accepted for publication or rejected by the Editor. We receive over 2000 submissions each year and accept about 250 for publication, many after revisions recommended by the reviewers, editors or statistical advisers. A decision usually takes between six and eight weeks. Each paper is assessed by two reviewers with a special interest in the subject covered by the paper, and also by members of the editorial team. Controversial papers will be discussed at a full meeting of the Editorial Board. Publication is between four and six months after acceptance.