Z. Wang, J. Crawmer, A. Guermazi, J. Duryea, M. Jarraya
DEEP LEARNING MODELS FOR AUTOMATIC JOINT SPACE WIDTH MEASUREMENT
Osteoarthritis Imaging, volume 5, Article 100355. Published 2025-01-01.
DOI: 10.1016/j.ostima.2025.100355
https://www.sciencedirect.com/science/article/pii/S2772654125000959
Abstract
INTRODUCTION
Accurate, automated measurement of femorotibial joint space width (fJSW) is crucial for assessing and monitoring osteoarthritis (OA). Current semi-automated (SA) fJSW measurement methods can be time-consuming and prone to inter-observer variability. This work evaluates a deep learning (DL) approach that substantially automates fJSW measurement from knee radiographs.
OBJECTIVE
To evaluate the performance of a DL method for automatic fJSW measurement by comparing it to a standard SA method.
METHODS
We randomly selected a single knee radiograph from each of 295 OAI participants (49 knees for each Kellgren-Lawrence (KL) grade 0-4); none were used for DL training. We measured the baseline (BL) and 48-month medial fixed-location fJSW at x=0.25 using both the SA and DL methods. fJSW(x=0.25) has been shown to be more responsive than other fJSW locations and minimum JSW. The SA fJSW measurement consists of a first step that delineates the femur to set up the necessary coordinate system, followed by a second step that delineates the femur and tibia to measure fJSW. We trained a separate DL algorithm for each step. The models employed an Attention U-Net architecture for segmenting joint spaces. This network enhances the standard U-Net encoder-decoder structure with attention mechanisms. The U-Net's encoder path progressively captures contextual information through a series of convolutional and pooling layers. The decoder path then gradually reconstructs the segmentation map by up-sampling features and combining them with high-resolution features from the encoder via skip connections. To assess performance, we calculated failure rates (assessed visually) for each step, the fJSW_DL to fJSW_SA correlation (Pearson’s R), and the responsiveness (standardized response mean: SRM). For DL coordinate system failures, the reader made manual corrections so that all knees could be passed to the DL fJSW algorithm.
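The abstract does not specify how attention is applied within the network. As a rough illustration only, the sketch below shows an additive attention gate of the kind commonly associated with Attention U-Net, in which decoder features re-weight the encoder features carried by a skip connection. The function names, the per-pixel list representation, and all weight values are assumptions made for this example and are not taken from the authors' models.

```python
import math

def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def attention_gate(skip, gating, w_skip, w_gate, psi, bias=0.0):
    """Additive attention gate applied independently at each pixel.

    skip   : per-pixel encoder feature vectors (the skip connection)
    gating : per-pixel decoder feature vectors at the same resolution
    Returns the re-weighted skip features and the attention coefficients.
    """
    gated, alphas = [], []
    for x, g in zip(skip, gating):
        # Project both inputs into a shared space, add them, apply ReLU.
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_skip, x), matvec(w_gate, g))]
        # Collapse to one score per pixel and squash it into (0, 1).
        score = sum(p * h for p, h in zip(psi, hidden)) + bias
        alpha = 1.0 / (1.0 + math.exp(-score))
        alphas.append(alpha)
        # Scale the skip features by the attention coefficient.
        gated.append([alpha * value for value in x])
    return gated, alphas

# Two pixels with two channels each; every number here is hypothetical.
skip = [[1.0, 2.0], [0.5, 0.1]]
gating = [[1.0, 0.0], [0.0, 0.0]]
w_skip = [[1.0, 0.0], [0.0, 1.0]]
w_gate = [[2.0, 0.0], [0.0, 2.0]]
psi = [1.0, 1.0]

gated, alphas = attention_gate(skip, gating, w_skip, w_gate, psi, bias=-2.0)
print(alphas)
```

In this toy input the first pixel receives a strong gating signal and keeps most of its skip features, while the second is largely suppressed. A real network would learn the weights and implement the projections as 1x1 convolutions.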
RESULTS
There were 58 coordinate system failures (11.7%), with the following KL distribution: KL0: 2, KL1: 7, KL2: 4, KL3: 9, KL4: 36. There were 31 fJSW failures (6.2%), distributed as follows: KL0: 4, KL1: 1, KL2: 4, KL3: 7, KL4: 15. We excluded the fJSW failures, leaving knees from 215 participants for the correlation and responsiveness analyses. The Pearson correlation was R = 0.97, and the SRM values were -0.64 (SA) and -0.67 (DL). Figure 1 is a Bland-Altman plot comparing the SA and DL fJSW, showing a minor bias and few outliers.
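The three agreement and responsiveness statistics reported here can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the sample measurements are invented, the SRM is taken as the mean change (follow-up minus baseline) divided by the standard deviation of the change, and the Bland-Altman limits use the conventional bias ± 1.96 SD.

```python
import math
from statistics import mean, stdev

def pearson_r(a, b):
    """Pearson correlation between two paired measurement lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def srm(baseline, followup):
    """Standardized response mean: mean change / SD of change."""
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(change)

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired differences a - b."""
    diff = [x - y for x, y in zip(a, b)]
    bias = mean(diff)
    sd = stdev(diff)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Invented fJSW values in mm for five knees (not study data).
sa_bl = [4.8, 5.1, 3.9, 4.4, 5.6]   # SA method, baseline
sa_fu = [4.5, 5.0, 3.2, 4.4, 5.1]   # SA method, 48 months
dl_bl = [4.7, 5.2, 3.8, 4.5, 5.5]   # DL method, baseline
dl_fu = [4.4, 5.0, 3.3, 4.3, 5.2]   # DL method, 48 months

print("R (baseline):", pearson_r(sa_bl, dl_bl))
print("SRM SA:", srm(sa_bl, sa_fu))
print("SRM DL:", srm(dl_bl, dl_fu))
print("Bland-Altman (DL - SA):", bland_altman(dl_bl, sa_bl))
```

With this sign convention, joint space narrowing over time gives a negative SRM, which matches the negative values reported above, and a larger magnitude indicates greater responsiveness.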
CONCLUSION
The results demonstrate that a DL algorithm can measure fJSW accurately, with responsiveness equivalent to or better than that of the SA method, dramatically reducing reader time while maintaining performance. Most of the failures occurred in KL4 knees, which are used less often in knee OA (KOA) studies. The DL software has the potential to be used in very large studies and clinical trials of KOA.
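The share of failures attributable to KL4 knees can be checked directly from the per-grade counts given in the RESULTS. The short tally below uses only those reported counts; the variable and function names are illustrative.

```python
# Failure counts by KL grade, as reported in the RESULTS.
coord_failures = {"KL0": 2, "KL1": 7, "KL2": 4, "KL3": 9, "KL4": 36}
jsw_failures = {"KL0": 4, "KL1": 1, "KL2": 4, "KL3": 7, "KL4": 15}

def kl4_share(counts):
    """Fraction of failures in a step that occurred in KL4 knees."""
    return counts["KL4"] / sum(counts.values())

# Combine both steps grade by grade.
combined = {grade: coord_failures[grade] + jsw_failures[grade]
            for grade in coord_failures}

print("Coordinate system step:", round(kl4_share(coord_failures), 3))
print("fJSW step:", round(kl4_share(jsw_failures), 3))
print("Both steps combined:", round(kl4_share(combined), 3))
```

On these counts, KL4 knees account for 36 of 58 coordinate system failures (about 62%), 15 of 31 fJSW failures (about 48%), and 51 of 89 failures across both steps (about 57%).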