IEEE Transactions on Medical Imaging — Latest Articles

LCGNet: Local Sequential Feature Coupling Global Representation Learning for Functional Connectivity Network Analysis with fMRI
IEEE Transactions on Medical Imaging · Pub Date: 2024-07-01 · DOI: 10.1109/TMI.2024.3421360
Jie Zhou, Biao Jie, Zhengdong Wang, Zhixiang Zhang, Tongchun Du, Weixin Bian, Yang Yang, Jun Jia
Analysis of functional connectivity networks (FCNs) derived from resting-state functional magnetic resonance imaging (rs-fMRI) has greatly advanced our understanding of brain diseases, including Alzheimer's disease (AD) and attention deficit hyperactivity disorder (ADHD). Advanced machine learning techniques, such as convolutional neural networks (CNNs), have been used to learn high-level feature representations of FCNs for automated brain disease classification. Even though convolution operations in CNNs are good at extracting local properties of FCNs, they generally cannot capture global temporal representations of FCNs well. Recently, the transformer technique has demonstrated remarkable performance in various tasks, which is attributed to its effective self-attention mechanism for capturing global temporal feature representations. However, it cannot effectively model the local network characteristics of FCNs. To this end, in this paper, we propose a novel network structure for Local sequential feature Coupling Global representation learning (LCGNet) that takes advantage of both convolutional operations and self-attention mechanisms for enhanced FCN representation learning. Specifically, we first build a dynamic FCN for each subject using an overlapped sliding window approach. We then construct three sequential components (i.e., an edge-to-vertex layer, a vertex-to-network layer, and a network-to-temporality layer) with a dual backbone branch of CNN and transformer to extract and couple topological information of brain networks from local to global scales. Experimental results on two real datasets (i.e., ADNI and ADHD-200) with rs-fMRI data show the superiority of our LCGNet.
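The overlapped sliding-window construction of a dynamic FCN described in the abstract can be sketched as follows; the window length and stride here are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def dynamic_fcn(ts, win_len=30, stride=10):
    """Build a dynamic functional connectivity network from ROI time series.

    ts: (T, R) array -- T time points for R regions of interest.
    Returns a (W, R, R) stack of per-window Pearson correlation matrices.
    """
    T, _ = ts.shape
    starts = range(0, T - win_len + 1, stride)
    return np.stack([np.corrcoef(ts[s:s + win_len].T) for s in starts])

# Example: 120 rs-fMRI time points over 90 ROIs -> 10 overlapping windows
rng = np.random.default_rng(0)
nets = dynamic_fcn(rng.standard_normal((120, 90)))
print(nets.shape)  # (10, 90, 90)
```

Each of the resulting correlation matrices is one snapshot of the brain network; the stack of snapshots is what the CNN/transformer branches would consume.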
Dual-Domain Collaborative Diffusion Sampling for Multi-Source Stationary Computed Tomography Reconstruction
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-28 · DOI: 10.1109/TMI.2024.3420411
Zirong Li, Dingyue Chang, Zhenxi Zhang, Fulin Luo, Qiegen Liu, Jianjia Zhang, Guang Yang, Weiwen Wu
The multi-source stationary CT, in which both the detector and the X-ray sources are fixed, is a novel imaging system with high temporal resolution that has garnered significant interest. Limited space within the system restricts the number of X-ray sources, leading to sparse-view CT imaging challenges. Recent diffusion models for reconstructing sparse-view CT have generally focused separately on the sinogram or image domain. Sinogram-centric models effectively estimate missing projections but may introduce artifacts, lacking mechanisms to ensure image correctness. Conversely, image-domain models, while capturing detailed image features, often struggle with complex data distributions, leading to inaccuracies in projections. To address these issues, the Dual-domain Collaborative Diffusion Sampling (DCDS) model integrates sinogram- and image-domain diffusion processes for enhanced sparse-view reconstruction, combining the strengths of both domains in an optimized mathematical framework. A collaborative diffusion mechanism underpins this model, improving sinogram recovery and image generation: it drives feedback-guided image generation from the sinogram domain and uses image-domain results to complete missing projections. Optimization of the DCDS model is further achieved through the alternating direction iteration method, focusing on data-consistency updates. Extensive testing, including numerical simulations, real phantoms, and clinical cardiac datasets, demonstrates the DCDS model's effectiveness. It consistently outperforms various state-of-the-art benchmarks, delivering exceptional reconstruction quality and precise sinograms.
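The data-consistency update at the heart of such dual-domain schemes can be illustrated with a toy linear forward model; a random matrix stands in for the sparse-view CT projector, and the diffusion priors of DCDS are omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_rays = 64, 40                      # image unknowns, measured rays (sparse view)
A = rng.standard_normal((n_rays, n_pix))    # toy stand-in for the CT forward operator
y = A @ rng.standard_normal(n_pix)          # measured (incomplete) sinogram

# Gradient steps toward sinogram-domain data consistency; in DCDS these
# alternate with generative (diffusion) updates in both domains.
x = np.zeros(n_pix)
step = 1.0 / np.linalg.norm(A, 2) ** 2      # safe step from the spectral norm
for _ in range(4000):
    x -= step * A.T @ (A @ x - y)

print(np.linalg.norm(A @ x - y) / np.linalg.norm(y))  # relative residual shrinks toward 0
```

The point of the toy loop is only the consistency term ‖Ax − y‖; the collaborative mechanism in the paper interleaves this with learned sampling in both domains.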
Towards human-scale magnetic particle imaging: development of the first system with superconductor-based selection coils
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-26 · DOI: 10.1109/TMI.2024.3419427
Tuan-Anh Le, Minh Phu Bui, Yaser Hadadian, Khaled Mohamed Gadelmowla, Seungjun Oh, Chaemin Im, Seungyong Hahn, Jungwon Yoon
Magnetic Particle Imaging (MPI) is an emerging tomographic modality that allows precise three-dimensional (3D) mapping of magnetic nanoparticle (MNP) concentration and distribution. Although significant progress has been made in improving MPI since its introduction, scaling it up for human applications has proven challenging. High-quality images have been obtained in animal-scale MPI scanners with gradients up to 7 T/m/μ0; however, for MPI systems with bore diameters around 200 mm, the gradients generated by electromagnets drop significantly, to below 0.5 T/m/μ0. Given the current technological limitations in image reconstruction and the properties of available MNPs, these low gradients inherently limit improvements in MPI resolution for higher-precision medical imaging. Utilizing superconductors stands out as a promising approach to developing a human-scale MPI system. In this study, we introduce, for the first time, a human-scale amplitude-modulated (AM) MPI system with superconductor-based selection coils. The system achieves an unprecedented magnetic field gradient of up to 2.5 T/m/μ0 within a 200 mm bore diameter, enabling a large field of view of 100 × 130 × 98 mm³ at 2.5 T/m/μ0 for 3D imaging. While the obtained spatial resolution is on the order of that of previous animal-scale AM MPI systems, incorporating superconductors to achieve such high gradients in a 200 mm bore marks a major step toward clinical MPI.
HiCervix: An Extensive Hierarchical Dataset and Benchmark for Cervical Cytology Classification
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-26 · DOI: 10.1109/TMI.2024.3419697
De Cai, Jie Chen, Junhan Zhao, Yuan Xue, Sen Yang, Wei Yuan, Min Feng, Haiyan Weng, Shuguang Liu, Yulong Peng, Junyou Zhu, Kanran Wang, Christopher Jackson, Hongping Tang, Junzhou Huang, Xiyue Wang
Cervical cytology is a critical screening strategy for early detection of pre-cancerous and cancerous cervical lesions. The challenge lies in accurately classifying the various cervical cytology cell types. Existing automated cervical cytology methods are primarily trained on databases covering a narrow range of coarse-grained cell types, which fail to provide a comprehensive and detailed performance analysis that accurately represents real-world cytopathology conditions. To overcome these limitations, we introduce HiCervix, the most extensive multi-center cervical cytology dataset currently available to the public. HiCervix includes 40,229 cervical cells from 4,496 whole-slide images, categorized into 29 annotated classes. These classes are organized within a three-level hierarchical tree to capture fine-grained subtype information. To exploit the semantic correlation inherent in this hierarchical tree, we propose HierSwin, a hierarchical vision-transformer-based classification network. HierSwin serves as a benchmark for detailed feature learning in both coarse-level and fine-level cervical cancer classification tasks. In our comprehensive experiments, HierSwin demonstrated remarkable performance, achieving 92.08% accuracy for coarse-level classification and 82.93% accuracy averaged across all three levels. Compared to board-certified cytopathologists, HierSwin achieved high classification performance (0.8293 versus 0.7359 average accuracy), highlighting its potential for clinical applications. The newly released HiCervix dataset, along with our benchmark HierSwin method, is poised to substantially advance deep learning algorithms for rapid cervical cancer screening and to greatly improve cancer prevention and patient outcomes in real-world clinical settings.
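Averaging accuracy across the levels of a label hierarchy, as reported for HierSwin, can be sketched like this; the tiny three-level tree and the predictions below are invented for illustration and are not the actual HiCervix taxonomy.

```python
# Toy three-level hierarchy: each fine-grained class maps up to two coarser parents.
# (Invented labels -- not the actual HiCervix classes.)
parents = {
    "LSIL": ("abnormal", "squamous"),
    "HSIL": ("abnormal", "squamous"),
    "AGC":  ("abnormal", "glandular"),
    "NILM": ("normal",   "normal"),
}

def per_level_accuracy(y_true, y_pred):
    """Accuracy at each of the three levels, coarsest first."""
    levels = []
    for lvl in range(2):   # the two coarse levels read off the tree
        hits = sum(parents[t][lvl] == parents[p][lvl] for t, p in zip(y_true, y_pred))
        levels.append(hits / len(y_true))
    fine = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return levels + [fine]

accs = per_level_accuracy(["LSIL", "AGC", "NILM", "HSIL"],
                          ["HSIL", "AGC", "NILM", "LSIL"])
print(accs, sum(accs) / 3)  # [1.0, 1.0, 0.5] -> mean 0.833...
```

Confusions between sibling fine classes still count as correct at the coarse levels, which is why the averaged-across-levels figure sits above the fine-level accuracy alone.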
PtbNet: Based on Local Few-Shot Classes and Small Objects to Accurately Detect PTB
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-26 · DOI: 10.1109/TMI.2024.3419134
Wenhui Yang, Shuo Gao, Hao Zhang, Hong Yu, Menglei Xu, Puimun Chong, Weijie Zhang, Hong Wang, Wenjuan Zhang, Airong Qian
Pulmonary Tuberculosis (PTB) is one of the world's most infectious illnesses, and its early detection is critical for prevention. Digital Radiography (DR) has been the most common and effective technique for examining PTB. However, due to the variety and weak specificity of phenotypes on DR chest X-rays (DCRs), it is difficult for radiologists to make reliable diagnoses. Although artificial intelligence technology has made considerable gains in assisting PTB diagnosis, it lacks methods to identify PTB lesions with few-shot classes and small objects. To solve these problems, geometric data augmentation was used to increase the number of DCRs, and a diffusion probability model was implemented for six few-shot classes. Importantly, we propose PtbNet, a new multi-lesion detector based on RetinaNet, constructed to detect the small objects among PTB lesions. The results showed that with the two data augmentations, the number of DCRs increased by 80% from 570 to 2,859. In pre-evaluation experiments with the RetinaNet baseline, AP improved by 9.9 for the six few-shot classes. Our extensive empirical evaluation showed that PtbNet achieved an AP of 28.2, outperforming nine other state-of-the-art methods. In the ablation study, combining BiFPN+ and PSPD-Conv increased AP by 2.1 and AP^s by 5.0, with an average gain of 9.8 in AP^m and AP^l. In summary, PtbNet not only improves the detection of small-object lesions but also enhances the ability to detect different types of PTB uniformly, helping physicians diagnose PTB lesions accurately. The code is available at https://github.com/Wenhui-person/PtbNet/tree/master.
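The AP^s/AP^m/AP^l figures above follow the COCO convention of bucketing objects by box area and matching detections by intersection-over-union; a minimal sketch of both ingredients (the bucket thresholds are the standard COCO ones, assumed rather than taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def size_bucket(box, small=32 ** 2, medium=96 ** 2):
    """COCO-style area buckets behind AP^s / AP^m / AP^l."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    return "small" if area < small else "medium" if area < medium else "large"

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.1429
print(size_bucket((0, 0, 20, 20)))           # small (area 400 < 32^2)
```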
Accurate Airway Tree Segmentation in CT Scans via Anatomy-aware Multi-class Segmentation and Topology-guided Iterative Learning
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-26 · DOI: 10.1109/TMI.2024.3419707
Puyang Wang, Dazhou Guo, Dandan Zheng, Minghui Zhang, Haogang Yu, Xin Sun, Jia Ge, Yun Gu, Le Lu, Xianghua Ye, Dakai Jin
Intrathoracic airway segmentation in computed tomography is a prerequisite for analyses of various respiratory diseases such as chronic obstructive pulmonary disease, asthma, and lung cancer. Due to low imaging contrast and noise at the peripheral branches, the topological complexity, and the intra-class imbalance of the airway tree, it remains challenging for deep learning-based methods to segment the complete airway tree (i.e., to extract the deeper branches). Unlike other organs with simpler shapes or topology, the airway's complex tree structure makes generating the "ground truth" label extremely burdensome (up to 7 hours of manual or 3 hours of semi-automatic annotation per case). Most existing airway datasets are incompletely labeled, limiting the completeness of computer-segmented airways. In this paper, we propose a new anatomy-aware multi-class airway segmentation method enhanced by topology-guided iterative self-learning. Based on the natural airway anatomy, we formulate a simple yet highly effective anatomy-aware multi-class segmentation task to intuitively handle the severe intra-class imbalance of the airway. To solve the incomplete-labeling issue, we propose a tailored iterative self-learning scheme to segment toward the complete airway tree. To generate pseudo-labels with higher sensitivity (while retaining similar specificity), we introduce a novel breakage attention map and design a topology-guided pseudo-label refinement method that iteratively connects the breaking branches commonly found in initial pseudo-labels. Extensive experiments were conducted on four datasets, including two public challenges. The proposed method achieves top performance in both the EXACT'09 challenge (average score) and the ATM'22 challenge (weighted average score). On a public BAS dataset and a private lung cancer dataset, our method significantly improves on previous leading approaches, extracting at least 6.1% (absolute) more detected tree length and 5.2% more tree branches while maintaining comparable precision.
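A crude proxy for the breakages this paper targets is counting connected components in a segmented airway mask: a complete tree has one component, and every extra component marks a broken branch to reconnect. A minimal one-dimensional sketch (the real method works on 3D masks with a learned breakage attention map):

```python
import numpy as np

def count_components_1d(mask):
    """Number of connected runs of True in a 1-D mask -- a toy stand-in
    for counting connected components of a segmented airway tree."""
    m = np.asarray(mask, dtype=int)
    # a component starts wherever the mask switches from 0 to 1
    starts = np.diff(np.concatenate([[0], m])) == 1
    return int(starts.sum())

branch = np.zeros(9, dtype=bool)
branch[0:4] = True
branch[5:9] = True                   # one-voxel gap -> a "breakage"
print(count_components_1d(branch))   # 2 components: the tree is broken

branch[4] = True                     # reconnect, as topology-guided refinement would
print(count_components_1d(branch))   # 1 component: complete tree
```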
Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Classification and Lateralization Analysis
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-25 · DOI: 10.1109/TMI.2024.3419041
Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo
Available evidence suggests that dynamic functional connectivity can capture time-varying abnormalities in brain activity in resting-state functional magnetic resonance imaging (rs-fMRI) data and has a natural advantage in uncovering mechanisms of abnormal brain activity in schizophrenia (SZ) patients. Hence, an advanced dynamic brain network analysis model, the temporal brain category graph convolutional network (Temporal-BCGCN), was employed. First, a unique dynamic brain network analysis module, DSF-BrainNet, was designed to construct dynamic synchronization features. Subsequently, a novel graph convolution method, TemporalConv, was proposed based on the synchronous temporal properties of the features. Finally, CategoryPool, the first modular deep learning test tool for abnormal hemispheric lateralization based on rs-fMRI data, was proposed. The study was validated on the COBRE and UCLA datasets, achieving 83.62% and 89.71% average accuracies, respectively, outperforming the baseline model and other state-of-the-art methods. The ablation results also demonstrate the advantages of TemporalConv over the traditional edge-feature graph convolution approach and the improvement of CategoryPool over the classical graph pooling approach. Interestingly, this study showed that the lower-order perceptual system and higher-order network regions in the left hemisphere are more severely dysfunctional in SZ than those in the right hemisphere, reaffirming the importance of the left medial superior frontal gyrus in SZ. Our code is available at https://github.com/swfen/Temporal-BCGCN.
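One way lateralization-aware pooling can work — keeping the two hemispheres separate so left/right differences are not averaged away — is to mean-pool node features per hemisphere. This is a heavily simplified sketch in the spirit of CategoryPool, not the paper's actual mechanism:

```python
import numpy as np

def hemisphere_pool(node_feats, is_left):
    """Mean-pool graph node features separately per hemisphere.

    node_feats: (N, F) features for N brain regions.
    is_left:    (N,) boolean mask marking left-hemisphere regions.
    Returns a (2, F) array: [left pool, right pool].
    """
    left = node_feats[is_left].mean(axis=0)
    right = node_feats[~is_left].mean(axis=0)
    return np.stack([left, right])

feats = np.arange(12, dtype=float).reshape(4, 3)   # 4 regions, 3 features each
pooled = hemisphere_pool(feats, np.array([True, False, True, False]))
print(pooled)  # row 0 averages regions 0 and 2; row 1 averages regions 1 and 3
```

A classifier reading the two pooled rows separately can then attribute its evidence to one hemisphere, which is what enables the lateralization analysis reported above.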
Blind CT Image Quality Assessment Using DDPM-Derived Content and Transformer-Based Evaluator
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-24 · DOI: 10.1109/TMI.2024.3418652
Yongyi Shi, Wenjun Xia, Ge Wang, Xuanqin Mou
Lowering the radiation dose per view and using sparse views per scan are two common CT scan modes, albeit often leading to distorted images characterized by noise and streak artifacts. Blind image quality assessment (BIQA) strives to evaluate perceptual quality in alignment with what radiologists perceive, which plays an important role in advancing low-dose CT reconstruction techniques. An intriguing direction involves developing BIQA methods that mimic the operational characteristics of the human visual system (HVS). The internal generative mechanism (IGM) theory holds that the HVS actively infers primary content to enhance comprehension. In this study, we introduce an innovative BIQA metric that emulates the active inference process of the IGM. Initially, an active inference module, implemented as a denoising diffusion probabilistic model (DDPM), is constructed to predict the primary content. Then, a dissimilarity map is derived by assessing the interrelation between the distorted image and its primary content. Finally, the distorted image and dissimilarity map are combined into a multi-channel image and fed into a transformer-based image quality evaluator. By leveraging the DDPM-derived primary content, our approach achieves competitive performance on a low-dose CT dataset.
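The multi-channel input to the transformer evaluator — the distorted image stacked with its dissimilarity from the DDPM-predicted primary content — can be sketched as follows. The absolute difference is used as a simple dissimilarity measure and random arrays stand in for real CT data; the paper's exact measure may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
distorted = rng.random((512, 512))                   # low-dose CT slice (toy data)
primary = distorted - 0.1 * rng.random((512, 512))   # stand-in for DDPM-inferred content

dissimilarity = np.abs(distorted - primary)          # pixel-wise dissimilarity map
multi_channel = np.stack([distorted, dissimilarity]) # (2, H, W) evaluator input

print(multi_channel.shape)  # (2, 512, 512)
```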
MCPL: Multi-modal Collaborative Prompt Learning for Medical Vision-Language Model
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-24 · DOI: 10.1109/TMI.2024.3418408
Pengyu Wang, Huaqi Zhang, Yixuan Yuan
Multi-modal prompt learning is a high-performance and cost-effective learning paradigm that learns text as well as image prompts to tune pre-trained vision-language (V-L) models such as CLIP for multiple downstream tasks. However, recent methods typically treat text and image prompts as independent components without considering the dependency between them. Moreover, extending multi-modal prompt learning to the medical field poses challenges due to the significant gap between general- and medical-domain data. To this end, we propose a Multi-modal Collaborative Prompt Learning (MCPL) pipeline that tunes a frozen V-L model to align medical text-image representations and thereby accomplish medical downstream tasks. We first construct the anatomy-pathology (AP) prompt for multi-modal prompting jointly with the text and image prompts. The AP prompt introduces instance-level anatomy and pathology information, enabling a V-L model to better comprehend medical reports and images. Next, we propose a graph-guided prompt collaboration module (GPCM), which explicitly establishes multi-way couplings between the AP, text, and image prompts, enabling collaborative prompt production and updating for more effective prompting. Finally, we develop a novel prompt configuration scheme that attaches the AP prompt to the query and key, and the text/image prompt to the value, in self-attention layers, improving the interpretability of the multi-modal prompts. Extensive experiments on numerous medical classification and object detection datasets show that the proposed pipeline achieves excellent effectiveness and generalization. Compared with state-of-the-art prompt learning methods, MCPL provides a more reliable multi-modal prompt paradigm for reducing the tuning costs of V-L models on medical downstream tasks. Our code is available at https://github.com/CUHK-AIM-Group/MCPL.
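Attaching prompt tokens to different slots of self-attention, as in MCPL's prompt configuration scheme, can be illustrated with plain scaled dot-product attention. The shapes and the single AP and text prompt tokens below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d, n = 16, 8
tokens = rng.standard_normal((n, d))       # image patch tokens
ap_prompt = rng.standard_normal((1, d))    # anatomy-pathology prompt token
txt_prompt = rng.standard_normal((1, d))   # text/image prompt token

# AP prompt joins the query and key; the text prompt joins the value,
# so attention weights are shaped by anatomy-pathology information while
# the mixed-in content comes from the text prompt.
q = np.concatenate([tokens, ap_prompt])    # (n+1, d)
k = np.concatenate([tokens, ap_prompt])    # (n+1, d)
v = np.concatenate([tokens, txt_prompt])   # (n+1, d)

attn = softmax(q @ k.T / np.sqrt(d)) @ v   # standard scaled dot-product attention
print(attn.shape)  # (9, 16)
```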
Time-Reversion Fast-Sampling Score-Based Model for Limited-Angle CT Reconstruction
IEEE Transactions on Medical Imaging · Pub Date: 2024-06-24 · DOI: 10.1109/TMI.2024.3418838
Yanyang Wang, Zirong Li, Weiwen Wu
The score-based generative model (SGM) has received significant attention in medical imaging, particularly in the context of limited-angle computed tomography (LACT). Traditional SGM approaches achieve robust reconstruction performance by incorporating a substantial number of sampling steps during the inference phase, but these established SGM-based methods incur a large computational cost to reconstruct a single case. The main challenge lies in achieving high-quality images with rapid sampling while preserving sharp edges and small features. In this study, we propose an innovative rapid-sampling strategy for SGMs, named the time-reversion fast-sampling (TIFA) score-based model, for LACT reconstruction. The entire sampling procedure adheres to the principles of robust optimization theory and is grounded in a comprehensive mathematical model. TIFA's rapid-sampling mechanism comprises several essential components: jump sampling, time-reversion with re-sampling, and compressed sampling. In the initial jump-sampling stage, multiple sampling steps are bypassed to expedite the attainment of preliminary results. Subsequently, during the time-reversion process, the preliminary results undergo controlled corruption by small-scale noise, and the re-sampling process then refines them. Finally, compressed sampling fine-tunes the refined results by imposing a regularization term. Quantitative and qualitative assessments on numerical simulations, a real physical phantom, and clinical cardiac datasets unequivocally demonstrate that the TIFA method (using 200 steps) outperforms other state-of-the-art methods (using 2,000 steps) on the available [0°, 90°] and [0°, 60°] angular ranges. Furthermore, experimental results underscore that TIFA continues to reconstruct high-quality images even with 10 steps. Our code is available at https://github.com/tianzhijiaoziA/TIFADiffusion.
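Jump sampling — running only a subsequence of the reverse-diffusion timesteps, as in TIFA's reduction from 2,000 steps to 200 (or even 10) — amounts to picking a decimated timestep schedule. A minimal sketch, assuming simple evenly spaced selection (the paper's actual schedule may differ):

```python
import numpy as np

def jump_schedule(total_steps=2000, kept_steps=200):
    """Evenly spaced subsequence of reverse-diffusion timesteps, descending."""
    return np.linspace(total_steps - 1, 0, kept_steps).round().astype(int)

sched = jump_schedule()
print(len(sched), sched[0], sched[-1])  # 200 1999 0

# With aggressive jumping only 10 steps remain, as in TIFA's extreme setting.
print(jump_schedule(kept_steps=10))
```

The sampler then denoises only at the kept timesteps, which is where the bulk of the 10x speedup comes from; time-reversion and compressed sampling compensate for the skipped steps.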