Authors: Kai-Chieh Chen, Matthew Kuo, Chun-Ho Lee, Hao-Chun Liao, Dung-Jang Tsai, Shing-An Lin, Chih-Wei Hsiang, Cheng-Kuang Chang, Kai-Hsiung Ko, Yi-Chih Hsu, Wei-Chou Chang, Guo-Shu Huang, Wen-Hui Fang, Chin-Sheng Lin, Shih-Hua Lin, Yuan-Hao Chen, Yi-Jen Hung, Chien-Sung Tsai, Chin Lin
Journal: Journal of Medical Systems, vol. 49, no. 1, p. 120; published 2025-09-30
DOI: 10.1007/s10916-025-02263-3
A Pretraining Approach for Small-sample Training Employing Radiographs (PASTER): a Multimodal Transformer Trained by Chest Radiography and Free-text Reports.
While deep convolutional neural networks (DCNNs) have achieved remarkable performance in chest X-ray interpretation, their success typically depends on access to large-scale, expertly annotated datasets. However, collecting such data in real-world clinical settings can be difficult because of limited labeling resources, privacy concerns, and patient variability. In this study, we applied a multimodal Transformer pretrained on free-text reports and their paired CXRs to evaluate the effectiveness of this method in settings with limited labeled data. Our dataset consisted of more than 1 million CXRs, each accompanied by a report from a board-certified radiologist and 31 structured labels. The results indicated that a linear model trained on embeddings from the pretrained model achieved AUCs of 0.907 and 0.903 on internal and external test sets, respectively, using only 128 cases and 384 controls; these results were comparable to those of a DenseNet trained on the entire dataset, whose AUCs were 0.908 and 0.903, respectively. Additionally, we demonstrated similar results by extending the application of this approach to a subset annotated with structured echocardiographic reports. Furthermore, this multimodal model exhibited excellent small-sample learning capabilities when tested on external validation sets such as CheXpert and ChestX-ray14. This research significantly reduces the sample size necessary for future artificial intelligence advancements in CXR interpretation.
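The evaluation protocol the abstract describes — freezing a pretrained encoder and fitting only a linear classifier on its embeddings with 128 cases and 384 controls, then scoring by AUC — can be sketched as below. This is a minimal illustration, not the authors' code: the pretrained multimodal encoder is not reproduced here, so synthetic 512-dimensional embeddings stand in for its output, and the embedding width, class shift, and test-set sizes are all assumptions for the sake of the demo.

```python
# Linear-probe sketch: train a linear classifier on frozen embeddings
# from a small labeled sample, then evaluate with AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
EMB_DIM = 512  # assumed embedding width; not stated in the abstract


def fake_embeddings(n, shift):
    """Placeholder for encoder(cxr_image): cases get a small mean shift."""
    return rng.normal(loc=shift, scale=1.0, size=(n, EMB_DIM))


# 128 cases + 384 controls, matching the training budget in the abstract
X_train = np.vstack([fake_embeddings(128, 0.2), fake_embeddings(384, 0.0)])
y_train = np.array([1] * 128 + [0] * 384)

# A held-out test set (sizes are arbitrary for this sketch)
X_test = np.vstack([fake_embeddings(200, 0.2), fake_embeddings(600, 0.0)])
y_test = np.array([1] * 200 + [0] * 600)

# The "linear model trained on embeddings": logistic regression as the probe
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auc = roc_auc_score(y_test, probe.predict_proba(X_test)[:, 1])
print(f"linear-probe AUC: {auc:.3f}")
```

The design point is that all representational capacity lives in the frozen pretrained encoder; the probe has only `EMB_DIM + 1` parameters, which is why a few hundred labeled examples can suffice.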
Journal Introduction:
Journal of Medical Systems provides a forum for the presentation and discussion of the increasingly extensive applications of new systems techniques and methods in hospital, clinic, and physician's office administration; pathology, radiology, and pharmaceutical delivery systems; medical records storage and retrieval; and ancillary patient-support systems. The journal publishes informative articles, essays, and studies across the entire scale of medical systems, from large hospital programs to novel small-scale medical services. Education is an integral part of this amalgamation of sciences, and selected articles are published in this area. Since existing medical systems are constantly being modified to fit particular circumstances and to solve specific problems, the journal includes a special section devoted to status reports on current installations.