Subject-level spinal osteoporotic fracture prediction combining deep learning vertebral outputs and limited demographic data

IF 3.1, Zone 3 (Medicine), Q2 ENDOCRINOLOGY & METABOLISM

Archives of Osteoporosis Pub Date : 2024-09-10 DOI:10.1007/s11657-024-01433-z

Nathan M. Cross, Jessica Perry, Qifei Dong, Gang Luo, Jonathan Renslo, Brian C. Chang, Nancy E. Lane, Lynn Marshall, Sandra K. Johnston, David R. Haynor, Jeffrey G. Jarvik, Patrick J. Heagerty


Citations: 0


Subject-level spinal osteoporotic fracture prediction combining deep learning vertebral outputs and limited demographic data

Summary

Automated screening for vertebral fractures could improve outcomes. We achieved an AUC-ROC = 0.968 for the prediction of moderate to severe fracture using a GAM with age and three maximal vertebral body scores of fracture from a convolutional neural network. Maximal fracture scores resulted in a performant model for subject-level fracture prediction. Combining individual deep learning vertebral body fracture scores and demographic covariates for subject-level classification of osteoporotic fracture achieved excellent performance (AUC-ROC of 0.968) on a large dataset of radiographs with basic demographic data.

Purpose

Osteoporotic vertebral fractures are common and morbid. Automated opportunistic screening for incidental vertebral fractures from radiographs, the highest volume imaging modality, could improve osteoporosis detection and management. We consider how to form patient-level fracture predictions and summarization to guide management, using our previously developed vertebral fracture classifier on segmented radiographs from a prospective cohort study of US men (MrOS). We compare the performance of logistic regression (LR) and generalized additive models (GAM) with combinations of individual vertebral scores and basic demographic covariates.
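As a reading aid, the two model classes compared here can be written in their standard textbook forms. The notation is introduced for illustration and is not taken from the paper: $Y$ is the subject-level fracture indicator, $x_1, \dots, x_p$ are the chosen covariates (for example, ranked vertebral scores and age), $\beta_j$ are coefficients, and $f_j$ are smooth functions estimated from the data.

```latex
% Logistic regression: the log-odds are linear in each covariate
\operatorname{logit} P(Y = 1 \mid x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% Generalized additive model: each covariate enters through a smooth function
\operatorname{logit} P(Y = 1 \mid x) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
```

The GAM reduces to logistic regression when every $f_j$ is linear, so the comparison asks whether nonlinear covariate effects improve subject-level prediction.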

Methods

Subject-level LR and GAM models were created retrospectively using all fracture predictions or summary variables such as order statistics, adjacent vertebral interactions, and demographic covariates (age, race/ethnicity). The classifier outputs for 8663 vertebrae from 1176 thoracic and lumbar radiographs in 669 subjects were divided by subject to perform stratified fivefold cross-validation. Models were assessed using multiple metrics, including receiver operating characteristic (ROC) and precision-recall (PR) curves.
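A minimal sketch of two steps described above: summarizing each subject's per-vertebra scores with order statistics, and dividing subjects (not vertebrae) into stratified folds. The function names, the `(subject_id, score)` row layout, the zero-padding for subjects with fewer than `k` vertebrae, and the round-robin fold assignment are all assumptions made for illustration; the paper's actual pipeline is not shown in this abstract.

```python
import random
from collections import defaultdict


def top_k_scores(scores, k=3):
    """Return the k largest scores in descending order.

    Subjects with fewer than k vertebrae are padded with 0.0
    (an illustrative choice, not taken from the paper).
    """
    ranked = sorted(scores, reverse=True)[:k]
    return ranked + [0.0] * (k - len(ranked))


def subject_features(vertebra_rows, k=3):
    """Group (subject_id, score) rows by subject and summarize each subject."""
    by_subject = defaultdict(list)
    for subject_id, score in vertebra_rows:
        by_subject[subject_id].append(score)
    return {sid: top_k_scores(s, k) for sid, s in by_subject.items()}


def stratified_subject_folds(labels, n_folds=5, seed=0):
    """Assign each subject to exactly one fold, balancing labels across folds.

    labels maps subject_id -> 0 or 1. Because folds are assigned per
    subject, all vertebrae of a subject stay in the same fold.
    """
    rng = random.Random(seed)
    folds = {}
    for cls in (0, 1):
        ids = sorted(sid for sid, y in labels.items() if y == cls)
        rng.shuffle(ids)
        for i, sid in enumerate(ids):
            folds[sid] = i % n_folds  # deal subjects out round-robin
    return folds
```

Splitting by subject keeps multiple radiographs of the same person out of both the training and validation sides of a fold, which would otherwise inflate the measured performance.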

Results

The best model (AUC-ROC = 0.968) was a GAM using the top three maximum vertebral fracture scores and age. Using top-ranked scores only, rather than all vertebral scores, improved performance for both model classes. Adding age, but not ethnicity, to the GAMs improved performance slightly.
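The AUC-ROC figure can be read as the probability that a randomly chosen subject with a fracture receives a higher predicted score than a randomly chosen subject without one. The sketch below computes it directly from that pairwise definition; `auc_roc` is a hypothetical helper written for illustration, not the evaluation code used in the study.

```python
def auc_roc(scores, labels):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking. This is quadratic in the
    number of subjects, which is fine for a few hundred of them.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Under this reading, 0.968 means that roughly 97 of every 100 such pairs are ordered correctly by the model.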

Conclusion

Maximal vertebral fracture scores resulted in the highest-performing models. While combining multiple vertebral body predictions risks decreasing specificity, our results demonstrate that subject-level models maintain good predictive performance. Thresholding strategies can be used to control sensitivity and specificity as clinically appropriate.
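One simple thresholding strategy, sketched under assumptions: pick the highest cutoff on the subject-level score that still reaches a target sensitivity, then report the specificity obtained at that cutoff. The helper names and the rule "score >= threshold counts as predicted fracture" are illustrative choices; the abstract does not specify which strategy the authors used.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold predicts fracture."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_sensitivity(scores, labels, target):
    """Return (threshold, sensitivity, specificity) for the largest observed
    score whose sensitivity is at least `target`."""
    for t in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, t)
        if sens >= target:
            return t, sens, spec
    raise ValueError("target sensitivity cannot be reached")


# Toy example with made-up scores, not data from the study.
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
cutoff, sens, spec = threshold_for_sensitivity(toy_scores, toy_labels, 1.0)
```

Lowering the cutoff raises sensitivity at the cost of specificity, so the target would be set by the clinical use case, for example favoring sensitivity in opportunistic screening.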


Source journal

Archives of Osteoporosis (ENDOCRINOLOGY & METABOLISM; ORTHOPEDICS)

CiteScore: 5.50

Self-citation rate: 10.00%

Articles published: 133

About the journal: Archives of Osteoporosis is an international multidisciplinary journal which is a joint initiative of the International Osteoporosis Foundation and the National Osteoporosis Foundation of the USA. The journal will highlight the specificities of different regions around the world concerning epidemiology, reference values for bone density and bone metabolism, as well as clinical aspects of osteoporosis and other bone diseases.