L. Jin, Shi Gu, D. Wei, Kaiming Kuang, H. Pfister, Bingbing Ni, Jiancheng Yang, Ming Li
{"title":"RibSeg v2: A Large-scale Benchmark for Rib Labeling and Anatomical Centerline Extraction","authors":"L. Jin, Shi Gu, D. Wei, Kaiming Kuang, H. Pfister, Bingbing Ni, Jiancheng Yang, Ming Li","doi":"10.48550/arXiv.2210.09309","DOIUrl":"https://doi.org/10.48550/arXiv.2210.09309","url":null,"abstract":"Automatic rib labeling and anatomical centerline extraction are common prerequisites for various clinical applications. Prior studies either use in-house datasets that are inaccessible to the community, or focus on rib segmentation that neglects the clinical significance of rib labeling. To address these issues, we extend our prior dataset (RibSeg) on the binary rib segmentation task to a comprehensive benchmark, named RibSeg v2, with 660 CT scans (15,466 individual ribs in total) and annotations manually inspected by experts for rib labeling and anatomical centerline extraction. Based on RibSeg v2, we develop a pipeline including deep learning-based methods for rib labeling, and a skeletonization-based method for centerline extraction. To improve computational efficiency, we propose a sparse point cloud representation of CT scans and compare it with standard dense voxel grids. Moreover, we design and analyze evaluation metrics to address the key challenges of each task. 
Our dataset, code, and model are available online to facilitate open research at https://github.com/M3DV/RibSeg.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46077902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongkuan Shi, Zhiwei Wang, Ying Zhou, Dun Li, Xin Yang, Qiang Li
{"title":"Bidirectional Semi-supervised Dual-branch CNN for Robust 3D Reconstruction of Stereo Endoscopic Images via Adaptive Cross and Parallel Supervisions","authors":"Hongkuan Shi, Zhiwei Wang, Ying Zhou, Dun Li, Xin Yang, Qiang Li","doi":"10.48550/arXiv.2210.08291","DOIUrl":"https://doi.org/10.48550/arXiv.2210.08291","url":null,"abstract":"Semi-supervised learning via a teacher-student network can train a model effectively on a few labeled samples. It enables a student model to distill knowledge from the teacher's predictions of extra unlabeled data. However, such knowledge flow is typically unidirectional, leaving the accuracy vulnerable to the quality of the teacher model. In this paper, we seek robust 3D reconstruction of stereo endoscopic images by proposing a novel form of bidirectional learning between two learners, each of which can play both roles of teacher and student concurrently. Specifically, we introduce two self-supervisions, i.e., Adaptive Cross Supervision (ACS) and Adaptive Parallel Supervision (APS), to learn a dual-branch convolutional neural network. The two branches predict two different disparity probability distributions for the same position, and output their expectations as disparity values. The learned knowledge flows across branches along two directions: a cross direction (disparity guides distribution in ACS) and a parallel direction (disparity guides disparity in APS). Moreover, each branch also learns confidences to dynamically refine its provided supervisions. In ACS, the predicted disparity is softened into a unimodal distribution, and the lower the confidence, the smoother the distribution. In APS, the incorrect predictions are suppressed by lowering the weights of those with low confidence. With the adaptive bidirectional learning, the two branches enjoy well-tuned mutual supervisions, and eventually converge on a consistent and more accurate disparity estimation. 
The experimental results on four public datasets demonstrate our superior accuracy over other state-of-the-art methods, with a relative decrease in average disparity error of at least 9.76%.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45789439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Dual-Attention Learning Network with Word and Sentence Embedding for Medical Visual Question Answering","authors":"Xiaofei Huang, Hongfang Gong","doi":"10.48550/arXiv.2210.00220","DOIUrl":"https://doi.org/10.48550/arXiv.2210.00220","url":null,"abstract":"Research in medical visual question answering (MVQA) can contribute to the development of computer-aided diagnosis. MVQA is a task that aims to predict accurate and convincing answers based on given medical images and associated natural language questions. This task requires extracting medical knowledge-rich feature content and forming a fine-grained understanding of it. Therefore, constructing an effective feature extraction and understanding scheme is key to modeling. Existing MVQA question extraction schemes mainly focus on word information, ignoring medical information in the text, such as medical concepts and domain-specific terms. Meanwhile, some visual and textual feature understanding schemes cannot effectively capture the correlation between regions and keywords for reasonable visual reasoning. In this study, a dual-attention learning network with word and sentence embedding (DALNet-WSE) is proposed. We design a module, transformer with sentence embedding (TSE), to extract a double embedding representation of questions containing keywords and medical information. A dual-attention learning (DAL) module consisting of self-attention and guided attention is proposed to model intensive intramodal and intermodal interactions. With multiple DAL modules (DALs), learning visual and textual co-attention can increase the granularity of understanding and improve visual reasoning. Experimental results on the ImageCLEF 2019 VQA-MED (VQA-MED 2019) and VQA-RAD datasets demonstrate that our proposed method outperforms previous state-of-the-art methods. 
According to the ablation studies and Grad-CAM maps, DALNet-WSE can extract rich textual information and has strong visual reasoning ability.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"PP 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42795071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Muzaffer Ozbey, S. Dar, H. A. Bedel, Onat Dalmaz, Şaban Ozturk, Alper Gungor, Tolga Çukur
{"title":"Unsupervised Medical Image Translation with Adversarial Diffusion Models","authors":"Muzaffer Ozbey, S. Dar, H. A. Bedel, Onat Dalmaz, Şaban Ozturk, Alper Gungor, Tolga Çukur","doi":"10.48550/arXiv.2207.08208","DOIUrl":"https://doi.org/10.48550/arXiv.2207.08208","url":null,"abstract":"Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols. A pervasive approach for synthesizing target images involves one-shot mapping through generative adversarial networks (GAN). Yet, GAN models that implicitly characterize the image distribution can suffer from limited sample fidelity. Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation. To capture a direct correlate of the image distribution, SynDiff leverages a conditional diffusion process that progressively maps noise and source images onto the target image. For fast and accurate image sampling during inference, large diffusion steps are taken with adversarial projections in the reverse diffusion direction. To enable training on unpaired datasets, a cycle-consistent architecture is devised with coupled diffusive and non-diffusive modules that bilaterally translate between two modalities. Extensive assessments are reported on the utility of SynDiff against competing GAN and diffusion models in multi-contrast MRI and MRI-CT translation. 
Our demonstrations indicate that SynDiff offers quantitatively and qualitatively superior performance against competing baselines.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45407622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zihan Li, Yunxiang Li, Qingde Li, You Zhang, Puyang Wang, Dazhou Guo, Le Lu, D. Jin, Qingqi Hong
{"title":"LViT: Language meets Vision Transformer in Medical Image Segmentation","authors":"Zihan Li, Yunxiang Li, Qingde Li, You Zhang, Puyang Wang, Dazhou Guo, Le Lu, D. Jin, Qingqi Hong","doi":"10.48550/arXiv.2206.14718","DOIUrl":"https://doi.org/10.48550/arXiv.2206.14718","url":null,"abstract":"Deep learning has been widely used in medical image segmentation and other aspects. However, the performance of existing medical image segmentation models has been limited by the challenge of obtaining sufficient high-quality labeled data due to the prohibitive data annotation cost. To alleviate this limitation, we propose a new text-augmented medical image segmentation model LViT (Language meets Vision Transformer). In our LViT model, medical text annotation is incorporated to compensate for the quality deficiency in image data. In addition, the text information can guide the generation of pseudo labels of improved quality in semi-supervised learning. We also propose an Exponential Pseudo label Iteration mechanism (EPI) to help the Pixel-Level Attention Module (PLAM) preserve local image features in the semi-supervised LViT setting. In our model, LV (Language-Vision) loss is designed to supervise the training of unlabeled images using text information directly. For evaluation, we construct three multimodal medical segmentation datasets (image + text) containing X-rays and CT images. Experimental results show that our proposed LViT has superior segmentation performance in both fully-supervised and semi-supervised settings. 
The code and datasets are available at https://github.com/HUANGLIZI/LViT.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44585888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jun Shi, Yuan-Yang Zhang, Zheng Li, Xiangmin Han, Saisai Ding, Jun Wang, Shihui Ying
{"title":"Pseudo-Data based Self-Supervised Federated Learning for Classification of Histopathological Images","authors":"Jun Shi, Yuan-Yang Zhang, Zheng Li, Xiangmin Han, Saisai Ding, Jun Wang, Shihui Ying","doi":"10.48550/arXiv.2205.15530","DOIUrl":"https://doi.org/10.48550/arXiv.2205.15530","url":null,"abstract":"Computer-aided diagnosis (CAD) can help pathologists improve diagnostic accuracy together with consistency and repeatability for cancers. However, the CAD models trained with the histopathological images only from a single center (hospital) generally suffer from the generalization problem due to the staining inconsistencies among different centers. In this work, we propose a pseudo-data based self-supervised federated learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic accuracy and generalization of CAD models. Specifically, the pseudo histopathological images are generated from each center, which contain both inherent and specific properties corresponding to the real images in this center, but do not include the privacy information. These pseudo images are then shared in the central server for self-supervised learning (SSL) to pre-train the backbone of the global model. A multi-task SSL is then designed to effectively learn both the center-specific information and common inherent representation according to the data characteristics. Moreover, a novel Barlow Twins based FL (FL-BT) algorithm is proposed to improve the local training for the CAD models in each center by conducting model contrastive learning, which benefits the optimization of the global model in the FL procedure. 
The experimental results on four public histopathological image datasets indicate the effectiveness of the proposed SSL-FL-BT on both diagnostic accuracy and generalization.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43105200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiran Wei, Xi Chen, Lei Zhu, Lipei Zhang, C. Schonlieb, S. Price, C. Li
{"title":"Multi-modal learning for predicting the genotype of glioma","authors":"Yiran Wei, Xi Chen, Lei Zhu, Lipei Zhang, C. Schonlieb, S. Price, C. Li","doi":"10.48550/arXiv.2203.10852","DOIUrl":"https://doi.org/10.48550/arXiv.2203.10852","url":null,"abstract":"The isocitrate dehydrogenase (IDH) gene mutation is an essential biomarker for the diagnosis and prognosis of glioma. It is promising to better predict glioma genotype by integrating focal tumor image and geometric features with brain network features derived from MRI. Convolutional neural networks show reasonable performance in predicting IDH mutation, which, however, cannot learn from non-Euclidean data, e.g., geometric and network data. In this study, we propose a multi-modal learning framework using three separate encoders to extract features of focal tumor image, tumor geometrics and global brain networks. To mitigate the limited availability of diffusion MRI, we develop a self-supervised approach to generate brain networks from anatomical multi-sequence MRI. Moreover, to extract tumor-related features from the brain network, we design a hierarchical attention module for the brain network encoder. Further, we design a bi-level multi-modal contrastive loss to align the multi-modal features and tackle the domain gap at the focal tumor and global brain. Finally, we propose a weighted population graph to integrate the multi-modal features for genotype prediction. Experimental results on the testing set show that the proposed model outperforms the baseline deep learning models. The ablation experiments validate the performance of different components of the framework. The visualized interpretation corresponds to clinical knowledge with further validation. 
In conclusion, the proposed learning framework provides a novel approach for predicting the genotype of glioma.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":" ","pages":""},"PeriodicalIF":10.6,"publicationDate":"2022-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46754384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concurrent Ischemic Lesion Age Estimation and Segmentation of CT Brain Using a Transformer-Based Network","authors":"Adam Marcus, Paul Bentley, D. Rueckert","doi":"10.1007/978-3-031-17899-3_6","DOIUrl":"https://doi.org/10.1007/978-3-031-17899-3_6","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"35 9","pages":"52-62"},"PeriodicalIF":10.6,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50987497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Allen Lu, Shawn S Ahn, Kevinminh Ta, Nripesh Parajuli, John C Stendahl, Zhao Liu, Nabil E Boutagy, Geng-Shi Jeng, Lawrence H Staib, Matthew O'Donnell, Albert J Sinusas, James S Duncan
{"title":"Learning-Based Regularization for Cardiac Strain Analysis via Domain Adaptation.","authors":"Allen Lu, Shawn S Ahn, Kevinminh Ta, Nripesh Parajuli, John C Stendahl, Zhao Liu, Nabil E Boutagy, Geng-Shi Jeng, Lawrence H Staib, Matthew O'Donnell, Albert J Sinusas, James S Duncan","doi":"10.1109/TMI.2021.3074033","DOIUrl":"https://doi.org/10.1109/TMI.2021.3074033","url":null,"abstract":"Reliable motion estimation and strain analysis using 3D+ time echocardiography (4DE) for localization and characterization of myocardial injury is valuable for early detection and targeted interventions. However, motion estimation is difficult due to the low-SNR that stems from the inherent image properties of 4DE, and intelligent regularization is critical for producing reliable motion estimates. In this work, we incorporated the notion of domain adaptation into a supervised neural network regularization framework. We first propose a semi-supervised Multi-Layered Perceptron (MLP) network with biomechanical constraints for learning a latent representation that is shown to have more physiologically plausible displacements. We extended this framework to include a supervised loss term on synthetic data and showed the effects of biomechanical constraints on the network’s ability for domain adaptation. We validated the semi-supervised regularization method on in vivo data with implanted sonomicrometers. 
Finally, we showed the ability of our semi-supervised learning regularization approach to identify infarct regions using estimated regional strain maps with good agreement to manually traced infarct regions from postmortem excised hearts.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"40 9","pages":"2233-2245"},"PeriodicalIF":10.6,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TMI.2021.3074033","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9236213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Naohiro Eda, Motofumi Fushimi, Keisuke Hasegawa, T. Nara
{"title":"A Method for Electrical Property Tomography Based on a Three-Dimensional Integral Representation of the Electric Field","authors":"Naohiro Eda, Motofumi Fushimi, Keisuke Hasegawa, T. Nara","doi":"10.36227/techrxiv.15153579.v1","DOIUrl":"https://doi.org/10.36227/techrxiv.15153579.v1","url":null,"abstract":"Magnetic resonance electrical properties tomography (MREPT) noninvasively reconstructs high-resolution electrical property (EP) maps using MRI scanners and is useful for diagnosing cancerous tissues. However, conventional MREPT methods have limitations: sensitivity to noise in the numerical Laplacian operation, difficulty in reconstructing three-dimensional (3D) EPs and convergence not guaranteed in the iterative process. We propose a novel, iterative 3D reconstruction MREPT method without a numerical Laplacian operation. We derive an integral representation of the electric field using its Helmholtz decomposition with Maxwell’s equations, under the assumption that the EPs are known on the boundary of the region of interest with the approximation that the unmeasurable magnetic field components are zero. Then, we solve the simultaneous equations composed of the integral representation and Ampere’s law using a convex projection algorithm whose convergence is theoretically guaranteed. The efficacy of the proposed method was validated through numerical simulations and a phantom experiment. The results showed that this method is effective in reconstructing 3D EPs and is robust to noise. 
It was also shown that our proposed method with the unmeasurable component $H^{-}$ enhances the accuracy of the EPs in a background and that with all the components of the magnetic field reduces the artifacts at the center of the slices except when all the components of the electric field are close to zero.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"41 1","pages":"1400-1409"},"PeriodicalIF":10.6,"publicationDate":"2021-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45873887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}