Deep learning for automated hip fracture detection and classification: achieving superior accuracy

IF 4.6 · Region 1 (Medicine) · Q1 ORTHOPEDICS

Bone & Joint Journal Pub Date : 2025-02-01 DOI:10.1302/0301-620X.107B2.BJJ-2024-0791.R1

Zhiqian Zheng, Byeong Y Ryu, Sung E Kim, Dae S Song, Seong H Kim, Jung-Wee Park, Du H Ro

Bone & Joint Journal, Volume 107-B, Issue 2, pages 213-220. Publication type: Journal Article.

Citations: 0

Abstract

Aims: The aim of this study was to develop and evaluate a deep learning-based model for classification of hip fractures to enhance diagnostic accuracy.

Methods: A retrospective study used 5,168 hip anteroposterior radiographs, with 4,493 radiographs from two institutes (internal dataset) for training and 675 radiographs from another institute for validation. A convolutional neural network (CNN)-based classification model was trained on four types of hip fractures (Displaced, Valgus-impacted, Stable, and Unstable), using DAMO-YOLO for data processing and augmentation. The model's accuracy, sensitivity, specificity, Intersection over Union (IoU), and Dice coefficient were evaluated. Orthopaedic surgeons' diagnoses served as the reference standard, with comparisons made before and after artificial intelligence assistance.
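
The metrics named in the Methods can be illustrated with a minimal Python sketch. It assumes one-vs-rest scoring for each fracture class and axis-aligned bounding boxes given as (x1, y1, x2, y2); the abstract states neither detail, and the function names and sample inputs are hypothetical rather than the authors' code.

```python
from typing import Dict, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def one_vs_rest_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                        positive: str) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def _areas(a: Box, b: Box) -> Tuple[float, float, float]:
    """Return (intersection area, area of a, area of b)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return iw * ih, area_a, area_b


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union: overlap divided by the combined area."""
    inter, area_a, area_b = _areas(a, b)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_dice(a: Box, b: Box) -> float:
    """Dice coefficient: twice the overlap divided by the sum of both areas."""
    inter, area_a, area_b = _areas(a, b)
    total = area_a + area_b
    return 2 * inter / total if total > 0 else 0.0
```

For a single pair of regions the two overlap measures are linked by Dice = 2·IoU / (1 + IoU), so an IoU of 0.70 corresponds to a Dice of about 0.82. Values averaged over many images need not satisfy this identity exactly.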

Results: The accuracy, sensitivity, specificity, IoU, and Dice coefficients of the model for the four fracture categories in the internal dataset were as follows: Displaced (1.0, 0.79, 1.0, 0.70, 0.82), Valgus-impacted (1.0, 0.80, 1.0, 0.70, 0.82), Stable (0.99, 0.95, 0.99, 0.83, 0.89), and Unstable (1.0, 0.98, 0.99, 0.86, 0.92), respectively. For the external validation dataset, the sensitivity and specificity were as follows: Displaced (0.83, 0.94), Valgus-impacted (0.89, 0.90), Stable (0.88, 0.95), and Unstable (0.85, 0.99), respectively. The overall means (Micro AVG and Macro AVG) for the external dataset were Micro AVG (0.83 (SD 0.05), 0.96 (SD 0.01)) and Macro AVG (0.69 (SD 0.02), 0.95 (SD 0.02)), respectively.
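
The difference between the Micro AVG and Macro AVG figures can be sketched as follows. This is a generic illustration of the two averaging conventions, shown for sensitivity only; the abstract does not say how its averages were computed, and the per-class counts below are invented for the example, not taken from the study.

```python
from typing import Dict, Tuple


def macro_micro_sensitivity(counts: Dict[str, Tuple[int, int]]) -> Tuple[float, float]:
    """Return (macro, micro) sensitivity from per-class (true positive, false negative) counts.

    Macro: unweighted mean of the per-class sensitivities, so every class counts equally.
    Micro: pooled true positives over pooled positives, so larger classes dominate.
    """
    per_class = [tp / (tp + fn) for tp, fn in counts.values() if (tp + fn) > 0]
    if not per_class:
        raise ValueError("no class has any positive examples")
    macro = sum(per_class) / len(per_class)
    total_tp = sum(tp for tp, _ in counts.values())
    total_pos = sum(tp + fn for tp, fn in counts.values())
    micro = total_tp / total_pos
    return macro, micro


# Hypothetical (true positive, false negative) counts per fracture class.
example_counts = {
    "Displaced": (90, 10),
    "Valgus-impacted": (8, 2),
    "Stable": (45, 5),
    "Unstable": (30, 10),
}
macro, micro = macro_micro_sensitivity(example_counts)
```

With imbalanced classes the two averages diverge: rare classes pull the macro average as strongly as common ones, while the micro average mostly reflects the largest classes.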

Conclusion: Our study demonstrates that, compared with human diagnosis alone, the developed model significantly improves the accuracy of detecting and classifying hip fractures. The model shows great potential for assisting clinicians with the accurate diagnosis and classification of hip fractures.




Source journal: Bone & Joint Journal (ORTHOPEDICS-SURGERY)

CiteScore: 9.40

Self-citation rate: 10.90%

Articles published: 318

About the journal: We welcome original articles from any part of the world. The papers are assessed by members of the Editorial Board and our international panel of expert reviewers, then either accepted for publication or rejected by the Editor. We receive over 2000 submissions each year and accept about 250 for publication, many after revisions recommended by the reviewers, editors or statistical advisers. A decision usually takes between six and eight weeks. Each paper is assessed by two reviewers with a special interest in the subject covered by the paper, and also by members of the editorial team. Controversial papers will be discussed at a full meeting of the Editorial Board. Publication is between four and six months after acceptance.