A multimodal similarity-aware and knowledge-driven pre-training approach for reliable pneumoconiosis diagnosis.

IF 1.4 | Zone 3 (Medicine) | JCR Q3, INSTRUMENTS & INSTRUMENTATION

Journal of X-Ray Science and Technology | Pub Date: 2025-01-01 | Epub Date: 2025-01-13 | DOI: 10.1177/08953996241296400

Xueting Ren, Guohua Ji, Surong Chu, Shinichi Yoshida, Juanjuan Zhao, Baoping Jia, Yan Qiang

Pages: 229-248

Citations: 0

Abstract


A multimodal similarity-aware and knowledge-driven pre-training approach for reliable pneumoconiosis diagnosis.

Background: Pneumoconiosis staging is challenging due to the low clarity of X-ray images and the small, diffuse nature of the lesions. Additionally, the scarcity of annotated data makes it difficult to develop accurate staging models. Although clinical text reports provide valuable contextual information, existing works primarily focus on designing multimodal image-text contrastive learning tasks, neglecting the high similarity of pneumoconiosis imaging representations. This results in inadequate extraction of fine-grained multimodal information and underutilization of domain knowledge, limiting their application in medical tasks.

Objective: The study aims to address the limitations of current multimodal methods by proposing a new approach that improves the precision of pneumoconiosis diagnosis and staging through enhanced fine-grained learning and better utilization of domain knowledge.

Methods: The proposed Multimodal Similarity-aware and Knowledge-driven Pre-Training (MSK-PT) approach involves two stages. In the first stage, we deeply analyze the similar features of pneumoconiosis images and use a similarity-aware modality alignment strategy to explore the fine-grained representations and associated disturbances of pneumoconiosis lesions between images and texts, guiding the model to match more appropriate feature representations. In the second stage, we utilize data-associated features and pre-stored domain knowledge features as priors and constraints to guide the downstream model in the visual domain without annotations. To address potential erroneous labels generated by model predictions, we further introduce an uncertainty threshold strategy to mitigate the negative impact of imperfect prediction labels and enhance model interpretability.
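The abstract does not say how the uncertainty threshold is computed. As a minimal sketch of the general idea only, and not the authors' implementation, the snippet below assumes uncertainty is measured by the normalized entropy of the softmax output, and keeps a pseudo-label only when that entropy falls below a cutoff. The function names, the `max_uncertainty` parameter, its default of 0.5, and the four-class example are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the result lies in [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0  # a single class carries no uncertainty
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def select_pseudo_labels(batch_logits, max_uncertainty=0.5):
    """Keep only predictions whose uncertainty is at or below the threshold.

    Assumption: entropy stands in for whatever uncertainty measure MSK-PT
    actually uses; the threshold value is illustrative.
    Returns a list of (sample_index, predicted_label, uncertainty).
    """
    kept = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        u = normalized_entropy(probs)
        if u <= max_uncertainty:
            label = max(range(len(probs)), key=probs.__getitem__)
            kept.append((i, label, u))
    return kept


if __name__ == "__main__":
    # Two unlabeled samples over four hypothetical classes:
    # the first is confidently predicted, the second is nearly uniform.
    logits = [[5.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.1, 0.0]]
    for idx, label, u in select_pseudo_labels(logits):
        print(f"sample {idx}: pseudo-label {label}, uncertainty {u:.3f}")
```

In this sketch the confident prediction is retained as a pseudo-label, while the near-uniform one is discarded so that it cannot act as a noisy training target. Exposing the per-sample uncertainty alongside the label is one simple way such a filter could also support interpretability.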

Results: We collected and created the pneumoconiosis chest X-ray (PneumoCXR) dataset to evaluate our proposed MSK-PT method. The experimental results show that our method achieved a classification accuracy of 81.73%, outperforming the state-of-the-art algorithms by 2.53%.

Conclusions: MSK-PT showed diagnostic performance that matches or exceeds the average radiologist's level, even with limited labeled data, highlighting the method's effectiveness and robustness.


Source journal

Journal of X-Ray Science and Technology (Engineering and Technology: Optics)

CiteScore: 4.90

Self-citation rate: 23.30%

Articles published: 150

Review time: 3 months

Journal description: Research areas within the scope of the journal include:
Interaction of x-rays with matter: x-ray phenomena, biological effects of radiation, radiation safety and optical constants.
X-ray sources: x-rays from synchrotrons, x-ray lasers, plasmas, and other sources, conventional or unconventional.
Optical elements: grazing incidence optics, multilayer mirrors, zone plates, gratings, other diffraction optics.
Optical instruments: interferometers, spectrometers, microscopes, telescopes, microprobes.