Deep Learning-Assisted Quantitative Measurement of Thoracolumbar Fracture Features on Lateral Radiographs
W. Yuh, E. Khil, Y. Yoon, Burnyoung Kim, Hongjun Yoon, Jihe Lim, Kyoung Yeon Lee, Yeong Seo Yoo, Kyeong Deuk An
Neurospine (impact factor 3.8; JCR Q1, Clinical Neurology). Published 2024-03-01. DOI: 10.14245/ns.2347366.683 (https://doi.org/10.14245/ns.2347366.683)
Citations: 1
Abstract
Objective: This study aimed to develop and validate a deep learning (DL) algorithm for the quantitative measurement of thoracolumbar (TL) fracture features, and to evaluate its efficacy across varying levels of clinical expertise.

Methods: Starting from a pretrained Mask Region-Based Convolutional Neural Network (Mask R-CNN) model, originally developed for vertebral body segmentation and fracture detection, we fine-tuned the model and added a new module that measures four fracture metrics (compression rate [CR], Cobb angle [CA], Gardner angle [GA], and sagittal index [SI]) from lumbar spine lateral radiographs. The ground truth (GT) for these metrics was derived from six-point labeling by 3 radiologists. Training used 1,000 nonfractured and 318 fractured radiographs; validation used 213 internal and 200 external fractured radiographs. The accuracy of the DL algorithm in quantifying fracture features was evaluated against GT using the intraclass correlation coefficient (ICC). Additionally, 4 readers with varying levels of expertise, including trainees and an attending spine surgeon, performed measurements with and without DL assistance, and their results were compared with GT and with the DL model.

Results: The DL algorithm showed good to excellent agreement with GT for CR, CA, GA, and SI in both internal validation (ICC 0.860, 0.944, 0.932, and 0.779, respectively) and external validation (ICC 0.836, 0.940, 0.916, and 0.815, respectively). DL-assisted measurements significantly improved most measurement values, particularly for trainees.

Conclusion: The DL algorithm was validated as an accurate tool for quantifying TL fracture features on radiographs. DL-assisted measurement is expected to expedite the diagnostic process and enhance reliability, particularly for less experienced clinicians.
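The agreement statistic named in the abstract, the intraclass correlation coefficient, can be illustrated with a short sketch. The abstract does not say which ICC form the authors used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement), which is one common choice when comparing a method against a reference. The function name `icc_2_1` and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per radiograph, one column per rater or
    method (for example, column 0 = ground truth, column 1 = DL output).
    ASSUMPTION: the ICC form is our choice; the abstract does not specify it.
    """
    n = len(ratings)      # number of subjects (radiographs)
    k = len(ratings[0])   # number of raters or methods
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")
    if any(len(row) != k for row in ratings):
        raise ValueError("every row must have the same number of ratings")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares for subjects, raters, and residual error.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical Cobb angles in degrees: [ground truth, DL measurement].
    sample = [[12.0, 13.1], [25.4, 24.8], [8.3, 9.0], [31.2, 30.1], [17.5, 18.2]]
    print(f"ICC(2,1) = {icc_2_1(sample):.3f}")
```

Because this form measures absolute agreement, a systematic offset between the two columns lowers the coefficient even when the measurements are perfectly correlated. A consistency-type ICC would ignore such an offset, so the choice of form matters when comparing reported values.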