Jun Tang, Xiang Yin, Jiangyuan Lai, Keyu Luo, Dongdong Wu
{"title":"融合x射线图像和临床数据的骨质疏松症多模态深度学习预测模型:算法开发和验证研究。","authors":"Jun Tang, Xiang Yin, Jiangyuan Lai, Keyu Luo, Dongdong Wu","doi":"10.2196/70738","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Osteoporosis is a bone disease characterized by reduced bone mineral density and mass, which increase the risk of fragility fractures in patients. Artificial intelligence can mine imaging features specific to different bone densities, shapes, and structures and fuse other multimodal features for synergistic diagnosis to improve prediction accuracy.</p><p><strong>Objective: </strong>This study aims to develop a multimodal model that fuses chest X-rays and clinical parameters for opportunistic screening of osteoporosis and to compare and analyze the experimental results with existing methods.</p><p><strong>Methods: </strong>We used multimodal data, including chest X-ray images and clinical data, from a total of 1780 patients at Chongqing Daping Hospital from January 2019 to August 2024. We adopted a probability fusion strategy to construct a multimodal model. In our model, we used a convolutional neural network as the backbone network for image processing and fine-tuned it using a transfer learning technique to suit the specific task of this study. In addition, we introduced a gradient-based wavelet feature extraction method. We combined it with an attention mechanism to assist in feature fusion, which enhanced the model's focus on key regions of the image and further improved its ability to extract image features.</p><p><strong>Results: </strong>The multimodal model proposed in this paper outperforms the traditional methods in the 4 evaluation metrics of area under the curve value, accuracy, sensitivity, and specificity. Compared with using only the X-ray image model, the multimodal model improved the area under the curve value significantly from 0.951 to 0.975 (P=.004), the accuracy from 89.32% to 92.36% (P=.045), the sensitivity from 89.82% to 91.23% (P=.03), and the specificity from 88.64% to 93.92% (P=.008).</p><p><strong>Conclusions: </strong>While the multimodal model that fuses chest X-ray images and clinical data demonstrated superior performance compared to unimodal models and traditional methods, this study has several limitations. The dataset size may not be sufficient to capture the full diversity of the population. The retrospective nature of the study may introduce selection bias, and the lack of external validation limits the generalizability of the findings. Future studies should address these limitations by incorporating larger, more diverse datasets and conducting rigorous external validation to further establish the model's clinical use.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e70738"},"PeriodicalIF":3.8000,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445622/pdf/","citationCount":"0","resultStr":"{\"title\":\"Fusion of X-Ray Images and Clinical Data for a Multimodal Deep Learning Prediction Model of Osteoporosis: Algorithm Development and Validation Study.\",\"authors\":\"Jun Tang, Xiang Yin, Jiangyuan Lai, Keyu Luo, Dongdong Wu\",\"doi\":\"10.2196/70738\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Osteoporosis is a bone disease characterized by reduced bone mineral density and mass, which increase the risk of fragility fractures in patients. 
Artificial intelligence can mine imaging features specific to different bone densities, shapes, and structures and fuse other multimodal features for synergistic diagnosis to improve prediction accuracy.</p><p><strong>Objective: </strong>This study aims to develop a multimodal model that fuses chest X-rays and clinical parameters for opportunistic screening of osteoporosis and to compare and analyze the experimental results with existing methods.</p><p><strong>Methods: </strong>We used multimodal data, including chest X-ray images and clinical data, from a total of 1780 patients at Chongqing Daping Hospital from January 2019 to August 2024. We adopted a probability fusion strategy to construct a multimodal model. In our model, we used a convolutional neural network as the backbone network for image processing and fine-tuned it using a transfer learning technique to suit the specific task of this study. In addition, we introduced a gradient-based wavelet feature extraction method. We combined it with an attention mechanism to assist in feature fusion, which enhanced the model's focus on key regions of the image and further improved its ability to extract image features.</p><p><strong>Results: </strong>The multimodal model proposed in this paper outperforms the traditional methods in the 4 evaluation metrics of area under the curve value, accuracy, sensitivity, and specificity. Compared with using only the X-ray image model, the multimodal model improved the area under the curve value significantly from 0.951 to 0.975 (P=.004), the accuracy from 89.32% to 92.36% (P=.045), the sensitivity from 89.82% to 91.23% (P=.03), and the specificity from 88.64% to 93.92% (P=.008).</p><p><strong>Conclusions: </strong>While the multimodal model that fuses chest X-ray images and clinical data demonstrated superior performance compared to unimodal models and traditional methods, this study has several limitations. The dataset size may not be sufficient to capture the full diversity of the population. The retrospective nature of the study may introduce selection bias, and the lack of external validation limits the generalizability of the findings. Future studies should address these limitations by incorporating larger, more diverse datasets and conducting rigorous external validation to further establish the model's clinical use.</p>\",\"PeriodicalId\":56334,\"journal\":{\"name\":\"JMIR Medical Informatics\",\"volume\":\"13 \",\"pages\":\"e70738\"},\"PeriodicalIF\":3.8000,\"publicationDate\":\"2025-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12445622/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR Medical Informatics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.2196/70738\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MEDICAL INFORMATICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR Medical Informatics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/70738","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
Fusion of X-Ray Images and Clinical Data for a Multimodal Deep Learning Prediction Model of Osteoporosis: Algorithm Development and Validation Study.
Background: Osteoporosis is a bone disease characterized by reduced bone mineral density and mass, which increase patients' risk of fragility fractures. Artificial intelligence can mine imaging features specific to different bone densities, shapes, and structures and fuse them with other multimodal features for synergistic diagnosis, improving prediction accuracy.
Objective: This study aims to develop a multimodal model that fuses chest X-rays and clinical parameters for opportunistic screening of osteoporosis and to compare its experimental results with those of existing methods.
Methods: We used multimodal data, including chest X-ray images and clinical data, from a total of 1780 patients at Chongqing Daping Hospital collected between January 2019 and August 2024. We adopted a probability fusion strategy to construct the multimodal model, using a convolutional neural network as the backbone network for image processing and fine-tuning it with transfer learning to suit the specific task of this study. In addition, we introduced a gradient-based wavelet feature extraction method and combined it with an attention mechanism to assist feature fusion, which strengthened the model's focus on key regions of the image and further improved its ability to extract image features.
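The abstract does not include implementation details, but a minimal sketch of the described probability (late) fusion design, in PyTorch, might look like the following. The ResNet-18 backbone, the clinical-feature dimension, the branch architectures, and the equal fusion weight are all illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class ProbabilityFusionModel(nn.Module):
    """Hypothetical late-fusion sketch: an image branch and a clinical
    branch each output an osteoporosis probability, which are then
    combined. Backbone choice, head sizes, and fusion weight are
    illustrative assumptions, not the paper's reported design."""

    def __init__(self, n_clinical: int = 12, fusion_weight: float = 0.5):
        super().__init__()
        # Image branch: ImageNet-pretrained CNN, fine-tuned for a
        # binary task (transfer learning, as the Methods describe).
        self.backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)
        # Clinical branch: small MLP over tabular clinical parameters.
        self.clinical = nn.Sequential(
            nn.Linear(n_clinical, 32), nn.ReLU(), nn.Linear(32, 1)
        )
        self.w = fusion_weight  # weight on the image probability

    def forward(self, image: torch.Tensor, clinical: torch.Tensor) -> torch.Tensor:
        p_img = torch.sigmoid(self.backbone(image))     # (B, 1)
        p_cli = torch.sigmoid(self.clinical(clinical))  # (B, 1)
        # Probability fusion: weighted average of branch probabilities.
        return self.w * p_img + (1.0 - self.w) * p_cli

model = ProbabilityFusionModel()
probs = model(torch.randn(4, 3, 224, 224), torch.randn(4, 12))  # (4, 1)
```

Fusing at the probability level keeps each branch independently trainable and interpretable, which is one common motivation for choosing late fusion over feature-level fusion.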
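Likewise, the gradient-based wavelet feature extraction combined with attention could plausibly take a form like the sketch below, in which a Sobel gradient map of the X-ray is decomposed with a 2D discrete wavelet transform and the detail coefficients drive a spatial attention map over the CNN feature maps. The Haar wavelet, the Sobel operator, and the residual way the map is applied are assumptions for illustration; the paper's actual formulation may differ.

```python
import numpy as np
import pywt
import torch
import torch.nn.functional as F
from scipy import ndimage

def gradient_wavelet_attention(image: np.ndarray, features: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: build a spatial attention map from the
    wavelet decomposition of the image gradient and use it to reweight
    CNN feature maps. `image` is a 2D grayscale array; `features` is a
    (C, H, W) feature tensor from the backbone."""
    # Gradient magnitude of the X-ray (Sobel in both directions).
    gx = ndimage.sobel(image.astype(np.float32), axis=0)
    gy = ndimage.sobel(image.astype(np.float32), axis=1)
    grad = np.hypot(gx, gy)
    # Single-level 2D discrete wavelet transform of the gradient map;
    # the detail sub-bands capture edge and texture energy.
    _, (ch, cv, cd) = pywt.dwt2(grad, "haar")
    detail = np.abs(ch) + np.abs(cv) + np.abs(cd)
    # Normalize to [0, 1] and resize to the feature map's spatial size.
    detail = (detail - detail.min()) / (detail.max() - detail.min() + 1e-8)
    attn = torch.from_numpy(detail)[None, None].float()
    attn = F.interpolate(attn, size=features.shape[-2:], mode="bilinear",
                         align_corners=False)[0]
    # Residual reweighting so key bony regions contribute more.
    return features * (1.0 + attn)
```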
Results: The multimodal model proposed in this paper outperforms traditional methods on 4 evaluation metrics: area under the curve (AUC), accuracy, sensitivity, and specificity. Compared with the X-ray image model alone, the multimodal model significantly improved the AUC from 0.951 to 0.975 (P=.004), accuracy from 89.32% to 92.36% (P=.045), sensitivity from 89.82% to 91.23% (P=.03), and specificity from 88.64% to 93.92% (P=.008).
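For reference, these 4 metrics can be computed from held-out labels and fused probabilities as in the short sketch below; the 0.5 decision threshold is an assumption, not necessarily the paper's operating point.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5) -> dict:
    """AUC, accuracy, sensitivity, and specificity from binary labels
    and predicted probabilities. The 0.5 threshold is illustrative."""
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "auc": roc_auc_score(y_true, y_prob),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```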
Conclusions: While the multimodal model that fuses chest X-ray images and clinical data demonstrated superior performance compared to unimodal models and traditional methods, this study has several limitations. The dataset size may not be sufficient to capture the full diversity of the population. The retrospective nature of the study may introduce selection bias, and the lack of external validation limits the generalizability of the findings. Future studies should address these limitations by incorporating larger, more diverse datasets and conducting rigorous external validation to further establish the model's clinical utility.
Journal Introduction:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It emphasizes applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry, and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (2016 impact factor: 5.175), JMIR Med Inform has a slightly different scope: it emphasizes applications for clinicians and health professionals rather than consumers/citizens (the focus of JMIR), publishes even faster, and also accepts papers that are more technical or more formative than what would be published in the Journal of Medical Internet Research.