Jiayi Liu, Peng Sun, Yousheng Yuan, Zihan Chen, Ke Tian, Qian Gao, Xiangsheng Li, Liang Xia, Jun Zhang, Nan Xu
{"title":"基于CT图像的YOLOv12算法辅助外踝撕脱骨折和腓骨下小骨的检测与分类:一项多中心研究。","authors":"Jiayi Liu, Peng Sun, Yousheng Yuan, Zihan Chen, Ke Tian, Qian Gao, Xiangsheng Li, Liang Xia, Jun Zhang, Nan Xu","doi":"10.2196/79064","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Lateral malleolar avulsion fractures (LMAFs) and subfibular ossicles (SFOs) are distinct entities that both present as small bone fragments near the lateral malleolus in imaging but require different treatment strategies. Clinical and radiological differentiation is challenging, which can impede timely and precise management. Magnetic resonance imaging (MRI) is the diagnostic gold standard for differentiating LMAFs from SFOs, whereas radiological differentiation using computed tomography (CT) alone is challenging in routine practice. Deep convolutional neural networks (DCNNs) have shown promise in musculoskeletal imaging diagnostics, but robust, multicenter evidence in this specific context is lacking.</p><p><strong>Objective: </strong>This study aims to evaluate several state-of-the-art DCNNs-including the latest You Only Look Once (YOLO) v12 algorithm-for detecting and classifying LMAFs and SFOs in CT images, using MRI-based diagnoses as the gold standard and to compare model performance with radiologists reading CT alone.</p><p><strong>Methods: </strong>In this retrospective study, 1918 patients (LMAF: n=1253, 65.3%; SFO: n=665, 34.7%) were enrolled from 2 hospitals in China between 2014 and 2024. MRI served as the gold standard and was independently interpreted by 2 senior musculoskeletal radiologists. Only CT images were used for model training, validation, and testing. CT images were manually annotated with bounding boxes. The cohort was randomly split into a training set (n=1092, 56.93%), internal validation set (n=476, 24.82%), and external test set (n=350, 18.25%). Four deep learning models-faster R-CNN, single shot multibox detector (SSD), RetinaNet, and YOLOv12-were trained and evaluated using identical procedures. Model performance was assessed using mean average precision at intersection over union=0.5 (mAP50), area under the receiver operating curve (AUC), accuracy, sensitivity, and specificity. The external test set was also independently interpreted by 2 musculoskeletal radiologists with 7 and 15 years of experience, with results compared with the best-performing model. Saliency maps were generated using Shapley values to enhance interpretability.</p><p><strong>Results: </strong>Among the evaluated models, YOLOv12 achieved the highest detection and classification performance, with a mAP50 of 92.1% and an AUC of 0.983 on the external test set-significantly outperforming faster R-CNN (mAP50 63.7%; AUC 0.79); SSD (mAP50 63%; AUC 0.63); and RetinaNet (mAP50 67.0%; AUC 0.73)-all P<.001. When using CT alone, radiologists performed at a moderate level (accuracy: 75.6% and 69.1%; sensitivity: 75.0% and 65.2%; specificity: 76.0% and 71.1%), whereas YOLOv12 approached MRI-based reference performance (accuracy: 92.0%; sensitivity: 86.7%; specificity: 82.2%). Saliency maps corresponded well with expert-identified regions.</p><p><strong>Conclusions: </strong>While MRI (read by senior radiologists) is the gold standard for distinguishing LMAFs from SFOs, CT-based differentiation is challenging for radiologists. 
A CT-only DCNN (YOLOv12) achieved substantially higher performance than radiologists interpreting CT alone and approached the MRI-based reference standard, highlighting its potential to augment CT-based decision-making where MRI is limited or unavailable.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":" ","pages":"e79064"},"PeriodicalIF":3.8000,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"YOLOv12 Algorithm-Aided Detection and Classification of Lateral Malleolar Avulsion Fracture and Subfibular Ossicle Based on CT Images: Multicenter Study.\",\"authors\":\"Jiayi Liu, Peng Sun, Yousheng Yuan, Zihan Chen, Ke Tian, Qian Gao, Xiangsheng Li, Liang Xia, Jun Zhang, Nan Xu\",\"doi\":\"10.2196/79064\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Lateral malleolar avulsion fractures (LMAFs) and subfibular ossicles (SFOs) are distinct entities that both present as small bone fragments near the lateral malleolus in imaging but require different treatment strategies. Clinical and radiological differentiation is challenging, which can impede timely and precise management. Magnetic resonance imaging (MRI) is the diagnostic gold standard for differentiating LMAFs from SFOs, whereas radiological differentiation using computed tomography (CT) alone is challenging in routine practice. Deep convolutional neural networks (DCNNs) have shown promise in musculoskeletal imaging diagnostics, but robust, multicenter evidence in this specific context is lacking.</p><p><strong>Objective: </strong>This study aims to evaluate several state-of-the-art DCNNs-including the latest You Only Look Once (YOLO) v12 algorithm-for detecting and classifying LMAFs and SFOs in CT images, using MRI-based diagnoses as the gold standard and to compare model performance with radiologists reading CT alone.</p><p><strong>Methods: </strong>In this retrospective study, 1918 patients (LMAF: n=1253, 65.3%; SFO: n=665, 34.7%) were enrolled from 2 hospitals in China between 2014 and 2024. MRI served as the gold standard and was independently interpreted by 2 senior musculoskeletal radiologists. Only CT images were used for model training, validation, and testing. CT images were manually annotated with bounding boxes. The cohort was randomly split into a training set (n=1092, 56.93%), internal validation set (n=476, 24.82%), and external test set (n=350, 18.25%). Four deep learning models-faster R-CNN, single shot multibox detector (SSD), RetinaNet, and YOLOv12-were trained and evaluated using identical procedures. Model performance was assessed using mean average precision at intersection over union=0.5 (mAP50), area under the receiver operating curve (AUC), accuracy, sensitivity, and specificity. The external test set was also independently interpreted by 2 musculoskeletal radiologists with 7 and 15 years of experience, with results compared with the best-performing model. Saliency maps were generated using Shapley values to enhance interpretability.</p><p><strong>Results: </strong>Among the evaluated models, YOLOv12 achieved the highest detection and classification performance, with a mAP50 of 92.1% and an AUC of 0.983 on the external test set-significantly outperforming faster R-CNN (mAP50 63.7%; AUC 0.79); SSD (mAP50 63%; AUC 0.63); and RetinaNet (mAP50 67.0%; AUC 0.73)-all P<.001. 
When using CT alone, radiologists performed at a moderate level (accuracy: 75.6% and 69.1%; sensitivity: 75.0% and 65.2%; specificity: 76.0% and 71.1%), whereas YOLOv12 approached MRI-based reference performance (accuracy: 92.0%; sensitivity: 86.7%; specificity: 82.2%). Saliency maps corresponded well with expert-identified regions.</p><p><strong>Conclusions: </strong>While MRI (read by senior radiologists) is the gold standard for distinguishing LMAFs from SFOs, CT-based differentiation is challenging for radiologists. A CT-only DCNN (YOLOv12) achieved substantially higher performance than radiologists interpreting CT alone and approached the MRI-based reference standard, highlighting its potential to augment CT-based decision-making where MRI is limited or unavailable.</p>\",\"PeriodicalId\":56334,\"journal\":{\"name\":\"JMIR Medical Informatics\",\"volume\":\" \",\"pages\":\"e79064\"},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2025-10-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR Medical Informatics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.2196/79064\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MEDICAL INFORMATICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Informatics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/79064","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
YOLOv12 Algorithm-Aided Detection and Classification of Lateral Malleolar Avulsion Fracture and Subfibular Ossicle Based on CT Images: Multicenter Study.
Background: Lateral malleolar avulsion fractures (LMAFs) and subfibular ossicles (SFOs) are distinct entities that both present as small bone fragments near the lateral malleolus on imaging but require different treatment strategies. Clinical and radiological differentiation is challenging, which can impede timely and precise management. Magnetic resonance imaging (MRI) is the diagnostic gold standard for differentiating LMAFs from SFOs, whereas differentiation using computed tomography (CT) alone is challenging in routine practice. Deep convolutional neural networks (DCNNs) have shown promise in musculoskeletal imaging diagnostics, but robust, multicenter evidence in this specific context is lacking.
Objective: This study aims to evaluate several state-of-the-art DCNNs, including the latest You Only Look Once (YOLO) v12 algorithm, for detecting and classifying LMAFs and SFOs in CT images, using MRI-based diagnoses as the gold standard, and to compare model performance with that of radiologists reading CT alone.
Methods: In this retrospective study, 1918 patients (LMAF: n=1253, 65.3%; SFO: n=665, 34.7%) were enrolled from 2 hospitals in China between 2014 and 2024. MRI served as the gold standard and was independently interpreted by 2 senior musculoskeletal radiologists. Only CT images, manually annotated with bounding boxes, were used for model training, validation, and testing. The cohort was randomly split into a training set (n=1092, 56.93%), an internal validation set (n=476, 24.82%), and an external test set (n=350, 18.25%). Four deep learning models (Faster R-CNN, Single Shot MultiBox Detector [SSD], RetinaNet, and YOLOv12) were trained and evaluated using identical procedures. Model performance was assessed using mean average precision at an intersection-over-union threshold of 0.5 (mAP50), area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. The external test set was also independently interpreted by 2 musculoskeletal radiologists with 7 and 15 years of experience, and their results were compared with those of the best-performing model. Saliency maps were generated using Shapley values to enhance interpretability.
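To make the detection metric concrete: mAP50 counts a predicted bounding box as a true positive only when its intersection over union (IoU) with a ground-truth box is at least 0.5, then averages precision over recall levels and classes. A minimal Python sketch of that IoU test, assuming axis-aligned [x1, y1, x2, y2] boxes (the helper names are illustrative and not taken from the study's code):

    def iou(box_a, box_b):
        # Intersection over union of two axis-aligned boxes [x1, y1, x2, y2].
        x1 = max(box_a[0], box_b[0])
        y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2])
        y2 = min(box_a[3], box_b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def is_true_positive(pred_box, gt_box, threshold=0.5):
        # A detection counts toward mAP50 only if IoU >= 0.5 with its matched ground truth.
        return iou(pred_box, gt_box) >= threshold

    # Example: a predicted fragment box against a hypothetical ground-truth annotation.
    print(is_true_positive([10, 10, 50, 50], [12, 8, 52, 48]))  # True (IoU ~0.82)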
Results: Among the evaluated models, YOLOv12 achieved the highest detection and classification performance, with a mAP50 of 92.1% and an AUC of 0.983 on the external test set, significantly outperforming Faster R-CNN (mAP50 63.7%; AUC 0.79), SSD (mAP50 63.0%; AUC 0.63), and RetinaNet (mAP50 67.0%; AUC 0.73); all P<.001. When using CT alone, radiologists performed at a moderate level (accuracy: 75.6% and 69.1%; sensitivity: 75.0% and 65.2%; specificity: 76.0% and 71.1%), whereas YOLOv12 approached MRI-based reference performance (accuracy: 92.0%; sensitivity: 86.7%; specificity: 82.2%). Saliency maps corresponded well with expert-identified regions.
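The accuracy, sensitivity, and specificity figures above follow from standard 2x2 confusion-matrix arithmetic, with LMAF treated as the positive class. A minimal sketch of that computation (the counts are hypothetical placeholders, not the study's data):

    def binary_metrics(tp, fp, tn, fn):
        # Accuracy, sensitivity, and specificity from a 2x2 confusion matrix.
        accuracy = (tp + tn) / (tp + fp + tn + fn)
        sensitivity = tp / (tp + fn)  # true-positive rate on the positive (LMAF) class
        specificity = tn / (tn + fp)  # true-negative rate on the negative (SFO) class
        return accuracy, sensitivity, specificity

    # Hypothetical counts for illustration only.
    acc, sens, spec = binary_metrics(tp=90, fp=10, tn=80, fn=20)
    print(f"accuracy={acc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
    # accuracy=0.850, sensitivity=0.818, specificity=0.889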
Conclusions: While MRI (read by senior radiologists) is the gold standard for distinguishing LMAFs from SFOs, CT-based differentiation is challenging for radiologists. A CT-only DCNN (YOLOv12) achieved substantially higher performance than radiologists interpreting CT alone and approached the MRI-based reference standard, highlighting its potential to augment CT-based decision-making where MRI is limited or unavailable.
Journal description:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry, and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (emphasizing applications for clinicians and health professionals rather than consumers/citizens, who are the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.