Haoran Wang , Qiuye Jin , Xiaofei Du , Liu Wang , Qinhao Guo , Haiming Li , Manning Wang , Zhijian Song
MDAL: Modality-difference-based active learning for multimodal medical image analysis via contrastive learning and pointwise mutual information
Computerized Medical Imaging and Graphics, Volume 123, Article 102544. Published 2025-04-09. Impact factor 5.4 (JCR Q1, Engineering, Biomedical).
DOI: 10.1016/j.compmedimag.2025.102544
URL: https://www.sciencedirect.com/science/article/pii/S0895611125000539
Citations: 0
Abstract
Multimodal medical images reveal different characteristics of the same anatomy or lesion, offering significant clinical value. Deep learning has achieved widespread success in medical image analysis with large-scale labeled datasets. However, annotating medical images is expensive and labor-intensive for doctors, and the variations between modalities further increase the annotation cost for multimodal images. This study aims to minimize the annotation cost for multimodal medical image analysis. We propose MDAL, a novel active learning framework for multimodal medical images based on modality differences. MDAL quantifies sample-wise modality differences through pointwise mutual information estimated by multimodal contrastive learning. We hypothesize that samples with larger modality differences are more informative for annotation, and we propose two sampling strategies based on these differences: MaxMD and DiverseMD. Moreover, MDAL can select informative samples in one shot without initial labeled data. We evaluated MDAL on public brain glioma and meningioma segmentation datasets and an in-house ovarian cancer classification dataset. MDAL outperforms other advanced active learning methods. Furthermore, using only 20%, 20%, and 15% of the labeled samples in these datasets, MDAL reaches 99.6%, 99.9%, and 99.3%, respectively, of the performance of supervised training on the fully labeled dataset. The results show that MDAL can significantly reduce the annotation cost for multimodal medical image analysis. We expect that MDAL can be extended to other multimodal medical data to lower annotation costs.
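The abstract does not give formulas, so the following is only an illustrative sketch of how a sample-wise pointwise mutual information (PMI) score might be read off a contrastive model and used for one-shot selection. It assumes the standard InfoNCE-style estimate (the paired similarity minus the log-mean-exp over all candidates in the batch), and it assumes that a lower PMI corresponds to a larger modality difference. The function names `pmi_scores` and `select_max_difference`, the `temperature` parameter, and the sign convention are all hypothetical and are not taken from the paper; the top-k rule is only a guess at the spirit of MaxMD, and DiverseMD is not sketched because the abstract does not describe it.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def pmi_scores(emb_a, emb_b, temperature=0.1):
    """InfoNCE-style PMI estimate for each paired sample (assumed form).

    emb_a[i] and emb_b[i] are embeddings of the same sample from two
    modalities. The estimate for sample i is
        s_ii - log( (1/N) * sum_j exp(s_ij) ),
    where s_ij is the temperature-scaled cosine similarity.
    It is bounded above by log(N).
    """
    a = [normalize(v) for v in emb_a]
    b = [normalize(v) for v in emb_b]
    n = len(a)
    scores = []
    for i in range(n):
        logits = [dot(a[i], b[j]) / temperature for j in range(n)]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        scores.append(logits[i] - log_sum + math.log(n))
    return scores


def select_max_difference(emb_a, emb_b, budget, temperature=0.1):
    """Pick the `budget` samples with the largest assumed modality difference.

    Assumption: modality difference is taken as the negative PMI estimate,
    so samples whose two modalities agree least are ranked first.
    """
    pmi = pmi_scores(emb_a, emb_b, temperature)
    ranked = sorted(range(len(pmi)), key=lambda i: -pmi[i], reverse=True)
    return ranked[:budget]
```

Because the ranking needs only the embeddings from a contrastively pretrained encoder pair, a selection of this kind requires no labels, which is consistent with the one-shot, cold-start setting the abstract describes.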
About the journal:
The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.