{"title":"对肝脏肿瘤诊断稳健的多模态超声分类:模态缺失的生成方法","authors":"Jiali Guo , Rui Bu , Wanting Shen , Tao Feng","doi":"10.1016/j.cmpb.2025.108759","DOIUrl":null,"url":null,"abstract":"<div><h3>Background and Objective</h3><div>In medical image analysis, combining multiple imaging modalities enhances diagnostic accuracy by providing complementary information. However, missing modalities are common in clinical settings, limiting the effectiveness of multimodal models. This study addresses the challenge of missing modalities in liver tumor diagnosis by proposing a generative model-based method for cross-modality reconstruction and classification. The dataset for this study comprises 359 case data from a hospital, with each case including three modality data: B-mode ultrasound images, Color Doppler Flow Imaging (CDFI), and clinical data. Only cases with one missing image modality are considered, excluding those with missing clinical data.</div></div><div><h3>Methods</h3><div>We developed a multimodal classification framework specifically for liver tumor diagnosis, employing various feature extraction networks to explore the impact of different modality combinations on classification performance when only available modalities are used. DenseNet extracts CDFI features, while EfficientNet is employed for B-mode ultrasound image feature extraction. These features are then flattened and concatenated with clinical data using feature-level fusion to obtain a full-modality model. Modality weight parameters are introduced to emphasize the importance of different modalities, yielding Model_D, which serves as the classification model after subsequent image modality supplementation. In cases of missing modalities, generative models, including U-GAT-IT and MSA-GAN, are utilized for cross-modal reconstruction of missing B-mode ultrasound or CDFI images (e.g., reconstructing CDFI from B-mode ultrasound when CDFI is missing). After evaluating the usability of the generated images, they are input into Model_D as supplementary images for the missing modalities.</div></div><div><h3>Results</h3><div>Model performance and modality supplementation effects were evaluated through accuracy, precision, recall, F1 score, and AUC metrics. The results demonstrate that the proposed Model_D, which introduces modality weights, achieves an accuracy of 88.57 %, precision of 87.97 %, recall of 82.32 %, F1 score of 0.87, and AUC of 0.95 in the full-modality classification task for liver tumors. Moreover, images reconstructed using U-GAT-IT and MSA-GAN across modalities exhibit PSNR > 20 and multi-scale structural similarity > 0.7, indicating moderate image quality with well-preserved overall structures, suitable for input into the model as supplementary images in cases of missing modalities. The supplementary CDFI or B-mode ultrasound images achieve 87.10 % and 86.43 % accuracy, respectively, with AUC values of 0.92 and 0.95. This proves that even in the absence of certain modalities, the generative models can effectively reconstruct missing images, maintaining high classification performance comparable to that in complete modality scenarios.</div></div><div><h3>Conclusions</h3><div>The generative model-based approach for modality reconstruction significantly improves the robustness of multimodal classification models, particularly in the context of liver tumor diagnosis. This method enhances the clinical applicability of multimodal models by ensuring high diagnostic accuracy despite missing modalities. 
Future work will explore further improvements in modality reconstruction techniques to increase the generalization and reliability of the model in various clinical settings.</div></div>","PeriodicalId":10624,"journal":{"name":"Computer methods and programs in biomedicine","volume":"265 ","pages":"Article 108759"},"PeriodicalIF":4.9000,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Towards robust multimodal ultrasound classification for liver tumor diagnosis: A generative approach to modality missingness\",\"authors\":\"Jiali Guo , Rui Bu , Wanting Shen , Tao Feng\",\"doi\":\"10.1016/j.cmpb.2025.108759\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background and Objective</h3><div>In medical image analysis, combining multiple imaging modalities enhances diagnostic accuracy by providing complementary information. However, missing modalities are common in clinical settings, limiting the effectiveness of multimodal models. This study addresses the challenge of missing modalities in liver tumor diagnosis by proposing a generative model-based method for cross-modality reconstruction and classification. The dataset for this study comprises 359 case data from a hospital, with each case including three modality data: B-mode ultrasound images, Color Doppler Flow Imaging (CDFI), and clinical data. Only cases with one missing image modality are considered, excluding those with missing clinical data.</div></div><div><h3>Methods</h3><div>We developed a multimodal classification framework specifically for liver tumor diagnosis, employing various feature extraction networks to explore the impact of different modality combinations on classification performance when only available modalities are used. DenseNet extracts CDFI features, while EfficientNet is employed for B-mode ultrasound image feature extraction. These features are then flattened and concatenated with clinical data using feature-level fusion to obtain a full-modality model. Modality weight parameters are introduced to emphasize the importance of different modalities, yielding Model_D, which serves as the classification model after subsequent image modality supplementation. In cases of missing modalities, generative models, including U-GAT-IT and MSA-GAN, are utilized for cross-modal reconstruction of missing B-mode ultrasound or CDFI images (e.g., reconstructing CDFI from B-mode ultrasound when CDFI is missing). After evaluating the usability of the generated images, they are input into Model_D as supplementary images for the missing modalities.</div></div><div><h3>Results</h3><div>Model performance and modality supplementation effects were evaluated through accuracy, precision, recall, F1 score, and AUC metrics. The results demonstrate that the proposed Model_D, which introduces modality weights, achieves an accuracy of 88.57 %, precision of 87.97 %, recall of 82.32 %, F1 score of 0.87, and AUC of 0.95 in the full-modality classification task for liver tumors. Moreover, images reconstructed using U-GAT-IT and MSA-GAN across modalities exhibit PSNR > 20 and multi-scale structural similarity > 0.7, indicating moderate image quality with well-preserved overall structures, suitable for input into the model as supplementary images in cases of missing modalities. The supplementary CDFI or B-mode ultrasound images achieve 87.10 % and 86.43 % accuracy, respectively, with AUC values of 0.92 and 0.95. 
This proves that even in the absence of certain modalities, the generative models can effectively reconstruct missing images, maintaining high classification performance comparable to that in complete modality scenarios.</div></div><div><h3>Conclusions</h3><div>The generative model-based approach for modality reconstruction significantly improves the robustness of multimodal classification models, particularly in the context of liver tumor diagnosis. This method enhances the clinical applicability of multimodal models by ensuring high diagnostic accuracy despite missing modalities. Future work will explore further improvements in modality reconstruction techniques to increase the generalization and reliability of the model in various clinical settings.</div></div>\",\"PeriodicalId\":10624,\"journal\":{\"name\":\"Computer methods and programs in biomedicine\",\"volume\":\"265 \",\"pages\":\"Article 108759\"},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2025-03-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer methods and programs in biomedicine\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0169260725001762\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer methods and programs in biomedicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169260725001762","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Towards robust multimodal ultrasound classification for liver tumor diagnosis: A generative approach to modality missingness
Background and Objective
In medical image analysis, combining multiple imaging modalities enhances diagnostic accuracy by providing complementary information. However, missing modalities are common in clinical settings, limiting the effectiveness of multimodal models. This study addresses the challenge of missing modalities in liver tumor diagnosis by proposing a generative-model-based method for cross-modality reconstruction and classification. The dataset comprises 359 cases from a single hospital, each including three modalities: B-mode ultrasound images, Color Doppler Flow Imaging (CDFI), and clinical data. Only cases missing a single image modality are considered; cases with missing clinical data are excluded.
Methods
We developed a multimodal classification framework specifically for liver tumor diagnosis, employing different feature extraction networks to explore how modality combinations affect classification performance when only the available modalities are used. DenseNet extracts CDFI features, while EfficientNet extracts B-mode ultrasound image features. These features are flattened and concatenated with the clinical data via feature-level fusion to obtain a full-modality model. Modality weight parameters are introduced to emphasize the relative importance of each modality, yielding Model_D, which serves as the classification model after subsequent image-modality supplementation (a sketch of this fusion architecture follows below). When an image modality is missing, generative models, including U-GAT-IT and MSA-GAN, perform cross-modal reconstruction of the missing B-mode ultrasound or CDFI image (e.g., reconstructing CDFI from B-mode ultrasound when CDFI is missing). After their usability is evaluated, the generated images are input into Model_D as supplementary images for the missing modalities.
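For concreteness, the following is a minimal sketch of the weighted feature-level fusion described above, assuming PyTorch/torchvision backbones. The class name `WeightedFusionClassifier`, the clinical feature dimension, and the scalar-per-modality softmax weighting are illustrative assumptions; the abstract does not specify Model_D's exact internals.

```python
import torch
import torch.nn as nn
from torchvision import models

class WeightedFusionClassifier(nn.Module):
    """Sketch of a Model_D-style fusion: DenseNet for CDFI, EfficientNet for
    B-mode ultrasound, learnable per-modality weights, feature-level concat."""

    def __init__(self, clinical_dim: int = 16, num_classes: int = 2):
        super().__init__()
        # DenseNet-121 convolutional trunk for CDFI images -> (B, 1024, h, w)
        self.cdfi_backbone = models.densenet121(weights=None).features
        # EfficientNet-B0 convolutional trunk for B-mode images -> (B, 1280, h, w)
        self.bmode_backbone = models.efficientnet_b0(weights=None).features
        self.pool = nn.AdaptiveAvgPool2d(1)
        # One learnable scalar per modality (CDFI, B-mode, clinical),
        # softmax-normalized so the weights stay positive and sum to 1.
        self.modality_logits = nn.Parameter(torch.zeros(3))
        fused_dim = 1024 + 1280 + clinical_dim
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, num_classes),
        )

    def forward(self, cdfi, bmode, clinical):
        w = torch.softmax(self.modality_logits, dim=0)
        f_cdfi = self.pool(self.cdfi_backbone(cdfi)).flatten(1) * w[0]
        f_bmode = self.pool(self.bmode_backbone(bmode)).flatten(1) * w[1]
        f_clin = clinical * w[2]
        # Feature-level fusion: flatten, weight, and concatenate all modalities.
        fused = torch.cat([f_cdfi, f_bmode, f_clin], dim=1)
        return self.classifier(fused)

if __name__ == "__main__":
    model = WeightedFusionClassifier(clinical_dim=16, num_classes=2)
    logits = model(torch.randn(2, 3, 224, 224),   # CDFI batch
                   torch.randn(2, 3, 224, 224),   # B-mode batch
                   torch.randn(2, 16))            # clinical feature vectors
    print(logits.shape)  # torch.Size([2, 2])
```

At inference with a missing modality, the corresponding input would be the U-GAT-IT or MSA-GAN reconstruction rather than a real acquisition; the classifier itself is unchanged.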
Results
Model performance and modality supplementation effects were evaluated through accuracy, precision, recall, F1 score, and AUC. The results demonstrate that the proposed Model_D, which introduces modality weights, achieves an accuracy of 88.57%, precision of 87.97%, recall of 82.32%, F1 score of 0.87, and AUC of 0.95 in the full-modality classification task for liver tumors. Moreover, images reconstructed across modalities using U-GAT-IT and MSA-GAN exhibit PSNR > 20 dB and multi-scale structural similarity (MS-SSIM) > 0.7, indicating moderate image quality with well-preserved overall structures, suitable as supplementary model inputs when a modality is missing. Classification with supplementary CDFI or B-mode ultrasound images achieves 87.10% and 86.43% accuracy, respectively, with AUC values of 0.92 and 0.95. This demonstrates that even when certain modalities are absent, the generative models can effectively reconstruct the missing images, maintaining classification performance comparable to the complete-modality scenario.
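The PSNR and MS-SSIM thresholds above suggest a simple quality gate for generated images. Below is a minimal sketch of such a gate, assuming paired ground-truth images are available (i.e., it runs on a validation split, since at deployment the missing modality has no reference). The function names are hypothetical, and MS-SSIM comes from the third-party `pytorch_msssim` package rather than anything the paper specifies.

```python
import torch
from pytorch_msssim import ms_ssim  # third-party: pip install pytorch-msssim

def psnr(pred: torch.Tensor, target: torch.Tensor, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for float tensors scaled to [0, data_range]."""
    mse = torch.mean((pred - target) ** 2)
    return float(10.0 * torch.log10(data_range ** 2 / mse))

def passes_quality_gate(pred: torch.Tensor, target: torch.Tensor,
                        psnr_min: float = 20.0, msssim_min: float = 0.7):
    """Accept a generated image as a supplementary modality only if it clears
    both thresholds reported in the abstract (PSNR > 20 dB, MS-SSIM > 0.7).
    Inputs are (B, C, H, W) float tensors; H and W should exceed 160 so the
    default 5-scale MS-SSIM window fits after four 2x downsamplings."""
    p = psnr(pred, target)
    s = float(ms_ssim(pred, target, data_range=1.0, size_average=True))
    return (p > psnr_min) and (s > msssim_min), p, s

if __name__ == "__main__":
    real = torch.rand(1, 3, 256, 256)                       # reference image
    fake = (real + 0.05 * torch.randn_like(real)).clamp(0, 1)  # stand-in for a GAN output
    ok, p, s = passes_quality_gate(fake, real)
    print(ok, round(p, 2), round(s, 3))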
Conclusions
The generative model-based approach for modality reconstruction significantly improves the robustness of multimodal classification models, particularly in the context of liver tumor diagnosis. This method enhances the clinical applicability of multimodal models by ensuring high diagnostic accuracy despite missing modalities. Future work will explore further improvements in modality reconstruction techniques to increase the generalization and reliability of the model in various clinical settings.
Journal Introduction:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.