A Deep Learning-Based Clinical Classification System for the Differential Diagnosis of Hip Prosthesis Failures Using Radiographs: A Multicenter Study
Authors: Limin Wu, Biao Wang, Bin Lin, Mingyang Li, Yuangang Wu, Haibo Si, Yi Zeng, Liangji Lu, Lulu Gao, Zheting Chen, Risheng Yu, Liang Zhao, Yong Nie, Kang Li, Bin Shen
Journal: The Journal of Bone & Joint Surgery
Publication date: 2025-06-18
Publication type: Journal Article
DOI: 10.2106/jbjs.24.01601 (https://doi.org/10.2106/jbjs.24.01601)
Citations: 0
Abstract
BACKGROUND
Accurate and timely differential diagnosis of hip prosthesis failures remains a major clinical challenge. Radiography is the most cost-effective and common first-line imaging modality for hip prostheses, and integrating deep learning has the potential to improve its diagnostic accuracy and efficiency.
METHODS
A deep learning-based clinical classification system (Hip-Net) was developed to classify multiple causes of total hip arthroplasty failure, including periprosthetic joint infection (PJI), aseptic loosening, dislocation, periprosthetic fracture, and polyethylene wear. Hip-Net employed a dual-channel ensemble of 4 deep learning models trained on 2,908 routine dual-view (anteroposterior and lateral) radiographs from 1,454 Asian patients across 3 medical centers. An interpretive subnetwork generated spatially resolved disease probability maps. Discrimination performance was tested in an external cohort and interpretability in a prospective cohort. The correlation between model-generated individual PJI risk and inflammatory biomarkers was assessed.
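The abstract does not say how the 4 models and the 2 views are combined. As an illustration only, the sketch below assumes plain soft voting, where class probabilities are averaged over every model and both views and the most probable class is reported. The function names, the `"ap"` and `"lateral"` keys, and the example numbers are hypothetical and are not taken from Hip-Net.

```python
from typing import Dict, List, Tuple

# The five failure types named in the abstract.
CLASSES = [
    "PJI",
    "aseptic loosening",
    "dislocation",
    "periprosthetic fracture",
    "polyethylene wear",
]
VIEWS = ("ap", "lateral")  # assumed keys for the two radiographic views


def soft_vote(model_outputs: List[Dict[str, List[float]]]) -> List[float]:
    """Average class probabilities over all models and both views.

    Assumption: simple unweighted soft voting. The published system may
    combine its models differently.
    """
    if not model_outputs:
        raise ValueError("need at least one model output")
    totals = [0.0] * len(CLASSES)
    count = 0
    for views in model_outputs:
        for view in VIEWS:
            probs = views[view]
            if len(probs) != len(CLASSES):
                raise ValueError("expected one probability per class")
            for i, p in enumerate(probs):
                totals[i] += p
            count += 1
    return [t / count for t in totals]


def predict(model_outputs: List[Dict[str, List[float]]]) -> Tuple[str, float]:
    """Return the most probable class and its averaged probability."""
    averaged = soft_vote(model_outputs)
    best = max(range(len(CLASSES)), key=averaged.__getitem__)
    return CLASSES[best], averaged[best]


if __name__ == "__main__":
    # Made-up probabilities for one patient from 4 hypothetical models.
    one_model = {
        "ap": [0.6, 0.1, 0.1, 0.1, 0.1],
        "lateral": [0.4, 0.3, 0.1, 0.1, 0.1],
    }
    print(predict([one_model] * 4))
```

Averaging probabilities rather than taking a majority of hard labels keeps a per-class score for each patient, which is the kind of quantity that could later be compared with biomarkers.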
RESULTS
Hip-Net demonstrated strong generalizability across different settings, effectively distinguishing between 5 common types of hip prosthesis failures with an accuracy of 0.904 (95% confidence interval [CI], 0.894 to 0.914) and an area under the receiver operating characteristic curve (AUC) of 0.937 (95% CI, 0.925 to 0.948) in the external cohort. The spatially resolved disease-probability maps for PJI closely aligned with intraoperative and pathological findings. The model-generated individual PJI risk scores exhibited a positive correlation with the C-reactive protein (CRP) level and erythrocyte sedimentation rate (ESR).
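The abstract reports accuracy with a 95% CI but does not state how the interval was computed. One common choice for a proportion is the Wilson score interval, sketched below under that assumption. The counts in the example are hypothetical and are not the study's data.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion such as accuracy.

    z = 1.96 corresponds to a two-sided 95% interval.
    Assumption: the study may have used a different method (e.g. bootstrap).
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical counts chosen only to give an accuracy of 0.904.
    low, high = wilson_interval(correct=904, total=1000)
    print(f"accuracy = 0.904, 95% CI {low:.3f} to {high:.3f}")
```

The interval narrows as the number of evaluated cases grows, so the same accuracy on a larger cohort yields a tighter CI.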
CONCLUSIONS
Hip-Net provided a clinically applicable strategy for accurately classifying and characterizing multiple etiologies of hip prosthesis failure. The approach also provides interpretable, pathology-aligned probability maps that enhance the understanding of PJI. Its integration into clinical workflows may streamline decision-making and improve patient outcomes.
LEVEL OF EVIDENCE
Prognostic Level III. See Instructions for Authors for a complete description of levels of evidence.