Uncertainty-Aware Deep Learning Characterization of Knee Radiographs for Large-Scale Registry Creation
Kellen L Mulford, Austin F Grove, Elizabeth S Kaji, Pouria Rouzrokh, Ryan Roman, Mete Kremers, Hilal Maradit Kremers, Michael J Taunton, Cody C Wyles
Journal of Arthroplasty (Journal Article), published 2024-10-28. DOI: 10.1016/j.arth.2024.10.103 (https://doi.org/10.1016/j.arth.2024.10.103)
Citation count: 0
Abstract
Background: We present an automated image ingestion pipeline for a knee radiography registry, integrating a multilabel image-semantic classifier with conformal prediction-based uncertainty quantification and an object detection model for knee hardware.
Methods: Annotators retrospectively labeled 26,000 knee images for knee presence, laterality, prostheses, and radiographic view. They further annotated surgical construct locations in 11,841 knee radiographs. An uncertainty-aware multilabel EfficientNet-based classifier was trained to identify knee laterality, implants, and radiographic view. A classifier trained on embeddings from the EfficientNet model detected out-of-domain images. An object detection model was trained to identify 20 different knee implants. Model performance was assessed against a held-out internal dataset and an external dataset using per-class F1 score, accuracy, sensitivity, and specificity. Conformal prediction was evaluated with marginal coverage and efficiency.
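To make the conformal prediction evaluation concrete, the following is a minimal sketch of split conformal prediction for a single binary label output, followed by the two evaluation quantities named above. It is not the authors' implementation. The nonconformity score (one minus the predicted probability of a class), the synthetic probabilities, the error level `alpha = 0.1`, and the reading of "efficiency" as the fraction of single-class prediction sets are all assumptions made for illustration; the abstract does not define them.

```python
import math
import random


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every class is included.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(p_pos, qhat):
    """Classes (0 or 1) whose score, 1 - p(class), does not exceed qhat."""
    scores = {0: p_pos, 1: 1.0 - p_pos}  # 1 - p(class 0) equals p_pos
    return {label for label, score in scores.items() if score <= qhat}


def marginal_coverage(sets, labels):
    """Fraction of samples whose true label lies in the prediction set."""
    return sum(y in s for s, y in zip(sets, labels)) / len(labels)


def singleton_rate(sets):
    """Fraction of prediction sets with exactly one class (assumed efficiency)."""
    return sum(len(s) == 1 for s in sets) / len(sets)


def simulate(n, rng):
    """Synthetic (probability, label) pairs standing in for classifier output."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        p_true = rng.betavariate(5, 1)  # probability assigned to the true class
        data.append((p_true if y == 1 else 1.0 - p_true, y))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    calibration, test = simulate(500, rng), simulate(500, rng)

    # Calibration scores use the true class: 1 - p(true class).
    cal_scores = [1.0 - p if y == 1 else p for p, y in calibration]
    qhat = conformal_threshold(cal_scores, alpha=0.1)

    sets = [prediction_set(p, qhat) for p, _ in test]
    labels = [y for _, y in test]
    print("marginal coverage:", marginal_coverage(sets, labels))
    print("singleton rate:", singleton_rate(sets))
```

With exchangeable calibration and test data, this construction targets marginal coverage of at least 1 - alpha. In a multilabel setting, the same procedure would be repeated for each label output under these assumptions.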
Results:
Classification model with conformal prediction: F1 scores for each label output were >0.98. Coverage of each label output was >0.99, and the average efficiency was 0.97.
Domain detection model: The F1 score was 0.99, with precision and recall for knee radiographs both 0.99.
Object detection model: Mean average precision across all classes was 0.945, with per-class values ranging from 0.695 to 1.000. Average precision and recall across all classes were 0.950 and 0.886.
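As a quick consistency check on the domain detection figures, the sketch below computes F1 as the harmonic mean of precision and recall. The confusion counts (`tp=99`, `fp=1`, `fn=1`) are hypothetical values chosen only to reproduce a precision and recall of 0.99; they are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts giving precision = recall = 0.99.
precision, recall = precision_recall(tp=99, fp=1, fn=1)
print(round(f1_score(precision, recall), 2))  # prints 0.99
```

The same formula should not be applied to the class-averaged precision and recall reported for the object detection model, since the harmonic mean of averaged values generally differs from the average of per-class F1 scores.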
Conclusions: We present a multilabel classifier with domain detection and an object detection model to characterize knee radiographs. Conformal prediction enhances transparency in cases where the model is uncertain.
About the journal:
The Journal of Arthroplasty brings together the clinical and scientific foundations for joint replacement. This peer-reviewed journal publishes original research and manuscripts of the highest quality from all areas relating to joint replacement or the treatment of its complications, including those dealing with clinical series and experience, prosthetic design, biomechanics, biomaterials, metallurgy, and biologic response to arthroplasty materials in vivo and in vitro.