Development and validation of deep learning models for identifying the brand of pedicle screws on plain spine radiographs

IF 3.9 · Zone 3 (Medicine) · Q1 ORTHOPEDICS

JOR Spine Pub Date : 2024-09-17 DOI:10.1002/jsp2.70001

Yu-Cheng Yao, Cheng-Li Lin, Hung-Hsun Chen, Hsi-Hsien Lin, Wei Hsiung, Shih-Tien Wang, Ying-Chou Sun, Yu-Hsuan Tang, Po-Hsin Chou

JOR Spine, Volume 7, Issue 3. Full text: https://onlinelibrary.wiley.com/doi/10.1002/jsp2.70001

Citations: 0

Abstract

Background

In spinal revision surgery, previously implanted pedicle screws (PS) may need to be replaced with new implants. Failure to accurately identify the brand of PS-based instrumentation preoperatively may increase the risk of perioperative complications. This study aimed to develop and validate an optimal deep learning (DL) model to identify the brand of PS-based instrumentation on plain radiographs of the spine (PRS) using anteroposterior (AP) and lateral images.

Methods

A total of 529 patients who received PS-based instrumentation from seven manufacturers were enrolled in this retrospective study. The postoperative PRS were gathered as ground truths. The training, validation, and testing datasets contained 338, 85, and 106 patients, respectively. YOLOv5 was used to crop out the screws' trajectory, and the EfficientNet-b0 model was used to develop single models (AP, Lateral, Merge, and Concatenated) based on the different PRS images. The ensemble models were different combinations of the single models. Primary outcomes were the models' performance in accuracy, sensitivity, precision, F1-score, kappa value, and area under the curve (AUC). Secondary outcomes were the relative performance of models versus human readers and external validation of the DL models.
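The primary outcome measures listed above can be computed from paired true and predicted brand labels. The following is a minimal sketch, assuming macro-averaging across brands (the abstract does not state the averaging scheme); the function name `classification_metrics` and the toy labels are hypothetical. AUC is omitted because it requires per-class probabilities rather than hard labels.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Accuracy, macro-averaged sensitivity/precision/F1, and Cohen's kappa.

    Macro-averaging is an assumption made for this sketch.
    """
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))  # confusion-matrix cells
    true_counts = Counter(y_true)         # row totals (actual brand)
    pred_counts = Counter(y_pred)         # column totals (predicted brand)

    accuracy = sum(pairs[(c, c)] for c in labels) / n

    sens, prec, f1 = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        s = tp / true_counts[c] if true_counts[c] else 0.0
        p = tp / pred_counts[c] if pred_counts[c] else 0.0
        sens.append(s)
        prec.append(p)
        f1.append(2 * p * s / (p + s) if (p + s) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    k = len(labels)
    return {
        "accuracy": accuracy,
        "sensitivity": sum(sens) / k,
        "precision": sum(prec) / k,
        "f1": sum(f1) / k,
        "kappa": kappa,
    }


# Hypothetical toy labels for two brands, not data from the study.
toy_true = ["A", "A", "B", "B"]
toy_pred = ["A", "B", "B", "B"]
toy_metrics = classification_metrics(toy_true, toy_pred, ["A", "B"])
```

Kappa is reported alongside accuracy because, with seven brands of unequal frequency, raw accuracy alone does not account for agreement expected by chance.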

Results

The Lateral model had the most stable performance among single models. The discriminative performance was improved by the ensemble method. The AP + Lateral ensemble model had the most stable performance, with an accuracy of 0.9434, an F1-score of 0.9388, and an AUC of 0.9834. The performance of the ensemble models was comparable to that of experienced orthopedic surgeons and superior to that of inexperienced orthopedic surgeons. External validation revealed that the Lat + Concat ensemble model had the best accuracy (0.9412).
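The abstract does not say how the single models were combined into ensembles. One common approach is soft voting, in which the per-class probabilities of each model are averaged and the highest-scoring class is chosen. The sketch below illustrates that idea only; the function `soft_vote`, the equal default weights, and the three-class probability vectors are assumptions for illustration (the study used seven manufacturers), not the authors' method.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities from several models and pick the top class.

    prob_lists: one probability vector per model, all of the same length.
    weights:    optional per-model weights; equal weighting if omitted.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


# Hypothetical softmax outputs for one patient over three brands.
ap_probs = [0.5, 0.4, 0.1]       # AP model favours brand 0
lateral_probs = [0.2, 0.7, 0.1]  # Lateral model favours brand 1
brand_index, averaged = soft_vote([ap_probs, lateral_probs])
```

In this toy case the two views disagree, and averaging lets the more confident Lateral prediction decide the outcome, which is one plausible reason an ensemble can be more stable than either single model.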

Conclusion

The DL models demonstrated stable performance in identifying the brand of PS-based instrumentation based on AP and/or lateral images of PRS, which may assist orthopedic spine surgeons in preoperative revision planning in clinical practice.


