{"title":"VibNet: Vibration-Boosted Needle Detection in Ultrasound Images","authors":"Dianye Huang;Chenyang Li;Angelos Karlas;Xiangyu Chu;K. W. Samuel Au;Nassir Navab;Zhongliang Jiang","doi":"10.1109/TMI.2025.3545434","DOIUrl":"10.1109/TMI.2025.3545434","url":null,"abstract":"Precise percutaneous needle detection is crucial for ultrasound (US)-guided interventions. However, inherent limitations such as speckles, needle-like artifacts, and low resolution make it challenging to robustly detect needles, especially when their visibility is reduced or imperceptible. To address this challenge, we propose VibNet, a learning-based framework designed to enhance the robustness and accuracy of needle detection in US images by leveraging periodic vibration applied externally to the needle shafts. VibNet integrates neural Short-Time Fourier Transform and Hough Transform modules to achieve successive sub-goals, including motion feature extraction in the spatiotemporal space, frequency feature aggregation, and needle detection in the Hough space. Due to the periodic subtle vibration, the features are more robust in the frequency domain than in the image intensity domain, making VibNet more effective than traditional intensity-based methods. To demonstrate the effectiveness of VibNet, we conducted experiments on distinct ex vivo porcine and bovine tissue samples. The results obtained on porcine samples demonstrate that VibNet effectively detects needles even when their visibility is severely reduced, with a tip error of <inline-formula> <tex-math>${1}.{61}pm {1}.{56}~textit {mm}$ </tex-math></inline-formula> compared to <inline-formula> <tex-math>${8}.{15}pm {9}.{98}~textit {mm}$ </tex-math></inline-formula> for UNet and <inline-formula> <tex-math>${6}.{63}pm {7}.{58}~textit {mm}$ </tex-math></inline-formula> for WNet, and a needle direction error of <inline-formula> <tex-math>${1}.{64}pm {1}.{86}^{circ }$ </tex-math></inline-formula> compared to <inline-formula> <tex-math>${9}.{29}~pm ~{15}.{30}^{circ }$ </tex-math></inline-formula> for UNet and <inline-formula> <tex-math>${8}.{54}~pm ~{17}.{92}^{circ }$ </tex-math></inline-formula> for WNet. Code: <uri>https://github.com/marslicy/VibNet</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2696-2708"},"PeriodicalIF":0.0,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10902567","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143495369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foundation Model-Guided Gaussian Splatting for 4D Reconstruction of Deformable Tissues","authors":"Yifan Liu;Chenxin Li;Hengyu Liu;Chen Yang;Yixuan Yuan","doi":"10.1109/TMI.2025.3545183","DOIUrl":"10.1109/TMI.2025.3545183","url":null,"abstract":"Reconstructing deformable anatomical structures from endoscopic videos is a pivotal and promising research topic that can enable advanced surgical applications and improve patient outcomes. While existing surgical scene reconstruction methods have made notable progress, they often suffer from slow rendering speeds due to using neural radiance fields, limiting their practical viability in real-world applications. To overcome this bottleneck, we propose EndoGaussian, a framework that integrates the strengths of 3D Gaussian Splatting representations, allowing for high-fidelity tissue reconstruction, efficient training, and real-time rendering. Specifically, we dedicate a Foundation Model-driven Initialization (FMI) module, which distills 3D cues from multiple vision foundation models (VFMs) to swiftly construct the preliminary scene structure for Gaussian initialization. Then, a Spatio-temporal Gaussian Tracking (SGT) is designed, efficiently modeling scene dynamics using the multi-scale HexPlane with spatio-temporal priors. Furthermore, to improve the dynamics modeling ability for scenes with large deformation, EndoGaussian integrates Motion-aware Frame Synthesis (MFS) to adaptively synthesize new frames as extra training constraints. Experimental results on public datasets demonstrate EndoGaussian’s efficacy against prior state-of-the-art methods, including superior rendering speed (168 FPS, real-time), enhanced rendering quality (38.555 PSNR), and reduced training overhead (within 2 min/scene). These results underscore EndoGaussian’s potential to significantly advance intraoperative surgery applications, paving the way for more accurate and efficient real-time surgical guidance and decision-making in clinical scenarios. Code is available at: <uri>https://github.com/CUHK-AIM-Group/EndoGaussian</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2672-2682"},"PeriodicalIF":0.0,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143495368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-Scanning Photoacoustic Endomicroscopy for High-Speed Gastrointestinal Microvascular Imaging","authors":"Hongdian Sun;Xiao Liang;Linyang Li;Yuanlong Zhao;Heng Guo;Weizhi Qi;Lei Xi","doi":"10.1109/TMI.2025.3544403","DOIUrl":"10.1109/TMI.2025.3544403","url":null,"abstract":"Photoacoustic endomicroscopy enables high-resolution imaging of deep microvasculature within the gastrointestinal wall using modulated laser pulses with point-by-point scanning. However, conventional scanning mechanisms frequently encounter difficulties in balancing imaging speed and field of view, particularly when imaging the peristaltic gastrointestinal tract. To address this challenge, we propose a dual-scanning photoacoustic endomicroscopy with an adjustable focal plane and an ultrafast imaging speed. The probe features two distinct scanning modes: 360° angular scanning providing a wide field of view, and regional spiral scanning offering high image quality. We demonstrated the capability of this probe through imaging both phantoms and rat rectums. The results from the rectal injury model demonstrate the applicability and sensitivity of the probe. Overall, this study offers new perspectives for expanding the applications and clinical potential of photoacoustic endomicroscopy.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2709-2717"},"PeriodicalIF":0.0,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143470664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Organ-DETR: Organ Detection via Transformers","authors":"Morteza Ghahremani;Benjamin Raphael Ernhofer;Jiajun Wang;Marcus Makowski;Christian Wachinger","doi":"10.1109/TMI.2025.3543581","DOIUrl":"10.1109/TMI.2025.3543581","url":null,"abstract":"Query-based Transformers have been yielding impressive performance in object localization and detection tasks. However, their application to organ detection in 3D medical imaging data has been relatively unexplored. This study introduces Organ-DETR, featuring two innovative modules, MultiScale Attention (MSA) and Dense Query Matching (DQM), designed to enhance the performance of Detection Transformers (DETRs) for 3D organ detection. MSA is a novel top-down representation learning approach for efficiently encoding Computed Tomography (CT) features. This architecture employs a multiscale attention mechanism, utilizing both dual self-attention and cross-scale attention mechanisms to extract intra- and inter-scale spatial interactions in the attention mechanism. Organ-DETR also introduces DQM, an approach for one-to-many matching that tackles the label assignment difficulties in organ detection. DQM increases positive queries to enhance both recall scores and training efficiency without the need for additional learnable parameters. Extensive results on five 3D CT datasets indicate that the proposed Organ-DETR outperforms comparable techniques by achieving a remarkable improvement of +10.6 mAP COCO. The project and code are available at <uri>https://github.com/ai-med/OrganDETR</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2657-2671"},"PeriodicalIF":0.0,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10892276","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143452341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial Flagship Toward the Future","authors":"Ge Wang","doi":"10.1109/TMI.2025.3543049","DOIUrl":"10.1109/TMI.2025.3543049","url":null,"abstract":"This editorial presents the vision and strategic direction of IEEE Transactions on Medical Imaging (TMI) under new leadership. Key points include restructuring the editorial board to enhance efficiency and diversity, streamlining the peer review process to improve decision quality and speed, and launching the <italic>AI for TMI</i> (AI4TMI) initiative to integrate AI in journal management. Through these efforts, TMI aims to sustain excellence, adapt to emerging trends, and shape the future of medical imaging research.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 3","pages":"1113-1114"},"PeriodicalIF":0.0,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10891575","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143443166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross- and Intra-Image Prototypical Learning for Multi-Label Disease Diagnosis and Interpretation","authors":"Chong Wang;Fengbei Liu;Yuanhong Chen;Helen Frazer;Gustavo Carneiro","doi":"10.1109/TMI.2025.3541830","DOIUrl":"10.1109/TMI.2025.3541830","url":null,"abstract":"Recent advances in prototypical learning have shown remarkable potential to provide useful decision interpretations associating activation maps and predictions with class-specific training prototypes. Such prototypical learning has been well-studied for various single-label diseases, but for quite relevant and more challenging multi-label diagnosis, where multiple diseases are often concurrent within an image, existing prototypical learning models struggle to obtain meaningful activation maps and effective class prototypes due to the entanglement of the multiple diseases. In this paper, we present a novel Cross- and Intra-image Prototypical Learning (CIPL) framework, for accurate multi-label disease diagnosis and interpretation from medical images. CIPL takes advantage of common cross-image semantics to disentangle the multiple diseases when learning the prototypes, allowing a comprehensive understanding of complicated pathological lesions. Furthermore, we propose a new two-level alignment-based regularisation strategy that effectively leverages consistent intra-image information to enhance interpretation robustness and predictive performance. Extensive experiments show that our CIPL attains the state-of-the-art (SOTA) classification accuracy in two public multi-label benchmarks of disease diagnosis: thoracic radiography and fundus images. Quantitative interpretability results show that CIPL also has superiority in weakly-supervised thoracic disease localisation over other leading saliency- and prototype-based explanation methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2568-2580"},"PeriodicalIF":0.0,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143417404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TransMatch: Employing Bridging Strategy to Overcome Large Deformation for Feature Matching in Gastroscopy Scenario","authors":"Guosong Zhu;Zhen Qin;Linfang Yu;Yi Ding;Zhiguang Qin","doi":"10.1109/TMI.2025.3541433","DOIUrl":"10.1109/TMI.2025.3541433","url":null,"abstract":"Feature matching is widely applied in the image processing field. However, both traditional feature matching methods and previous deep learning-based methods struggle to accurately match the features with severe deformations and large displacements, particularly in gastroscopy scenario. To fill this gap, an effective feature matching framework named TransMatch is proposed, which addresses the largely displacements issue by matching features with global information leveraged via Transformer structure. To address the severe deformation of features, an effective bridging strategy with a novel bidirectional quadratic interpolation network is employed. This bridging strategy decomposes and simplifies the matching of features undergoing severe deformations. A deblurring module for gastroscopy scenario is specifically designed to address the potential blurriness. Experiments have illustrated that proposed method achieves state-of-the-art performance of feature matching and frame interpolation in gastroscopy scenario. Moreover, a large-scale gastroscopy dataset is also constructed for multiple tasks.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2643-2656"},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143417410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Method for Correcting the Muscle Fiber Orientation Determined by a T-Shaped Transducer in Ultrasound Shear Wave Elastography","authors":"Chien Chen;Guo-Xuan Xu;Wei-Ren Su;Chih-Chung Huang","doi":"10.1109/TMI.2025.3541321","DOIUrl":"10.1109/TMI.2025.3541321","url":null,"abstract":"Shear wave elastography (SWE) is a quantitative imaging method that could be used for clinical assessment of musculoskeletal stiffness, particularly in disease diagnosis and rehabilitation evaluation. However, the elastic anisotropy of skeletal muscle leads to uncertainties in shear wave velocity (SWV) measurements in SWE because the SWV varies with muscle fiber orientation. Therefore, many studies have conducted 360° rotational measurements of SWV to determine the elastic anisotropy of muscle; however, the extended data acquisition time of this approach limits its clinical utility. In this study, a T-shaped transducer was used for rapidly measuring the longitudinal and transverse SWVs (<inline-formula> <tex-math>$textit {SWV}_{L}$ </tex-math></inline-formula> and <inline-formula> <tex-math>$textit {SWV}_{T}$ </tex-math></inline-formula>) of muscle through an ellipse fitting method to estimate the fiber orientation angle when the excitation is normal to the material axis. The performance of this approach was examined by conducting a homogeneous elastic phantom experiment, which indicated that the proposed T-shaped transducer generated shear waves in three directions by applying a supersonic push at the junction of the transducer. The error between the measured SWVs and ground truth was approximately 6.5%. The proposed T-shaped transducer was also used to measure the SWV in the biceps brachii of four healthy individuals. The <inline-formula> <tex-math>$textit {SWV}_{L}$ </tex-math></inline-formula> and <inline-formula> <tex-math>$textit {SWV}_{T}$ </tex-math></inline-formula> values measured with this transducer were 2.47 and 1.09 m/s, respectively, which were consistent with the SWVs obtained under 360° rotation and in the literature (an error of ~4%). All experimental results were consistent with the results obtained under 360° rotation, which indicates that the proposed method enables the rapid and stable estimation of muscle fiber orientation in SWE.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2528-2540"},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143417409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CT-SDM: A Sampling Diffusion Model for Sparse-View CT Reconstruction Across Various Sampling Rates","authors":"Liutao Yang;Jiahao Huang;Guang Yang;Daoqiang Zhang","doi":"10.1109/TMI.2025.3541491","DOIUrl":"10.1109/TMI.2025.3541491","url":null,"abstract":"Sparse views X-ray computed tomography has emerged as a contemporary technique to mitigate radiation dose. Because of the reduced number of projection views, traditional reconstruction methods can lead to severe artifacts. Recently, research studies utilizing deep learning methods has made promising progress in removing artifacts for Sparse-View Computed Tomography (SVCT). However, given the limitations on the generalization capability of deep learning models, current methods usually train models on fixed sampling rates, affecting the usability and flexibility of model deployment in real clinical settings. To address this issue, our study proposes a adaptive reconstruction method to achieve high-performance SVCT reconstruction at various sampling rate. Specifically, we design a novel imaging degradation operator in the proposed sampling diffusion model for SVCT (CT-SDM) to simulate the projection process in the sinogram domain. Thus, the CT-SDM can gradually add projection views to highly undersampled measurements to generalize the full-view sinograms. By choosing an appropriate starting point in diffusion inference, the proposed model can recover the full-view sinograms from various sampling rate with only one trained model. Experiments on several datasets have verified the effectiveness and robustness of our approach, demonstrating its superiority in reconstructing high-quality images from sparse-view CT scans across various sampling rates.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2581-2593"},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143417407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Masked Vascular Structure Segmentation and Completion in Retinal Images","authors":"Yi Zhou;Thiara Sana Ahmed;Meng Wang;Eric A. Newman;Leopold Schmetterer;Huazhu Fu;Jun Cheng;Bingyao Tan","doi":"10.1109/TMI.2025.3538336","DOIUrl":"10.1109/TMI.2025.3538336","url":null,"abstract":"Early retinal vascular changes in diseases such as diabetic retinopathy often occur at a microscopic level. Accurate evaluation of retinal vascular networks at a micro-level could significantly improve our understanding of angiopathology and potentially aid ophthalmologists in disease assessment and management. Multiple angiogram-related retinal imaging modalities, including fundus, optical coherence tomography angiography, and fluorescence angiography, project continuous, inter-connected retinal microvascular networks into imaging domains. However, extracting the microvascular network, which includes arterioles, venules, and capillaries, is challenging due to the limited contrast and resolution. As a result, the vascular network often appears as fragmented segments. In this paper, we propose a backbone-agnostic Masked Vascular Structure Segmentation and Completion (MaskVSC) method to reconstruct the retinal vascular network. MaskVSC simulates missing sections of blood vessels and uses this simulation to train the model to predict the missing parts and their connections. This approach simulates highly heterogeneous forms of vessel breaks and mitigates the need for massive data labeling. Accordingly, we introduce a connectivity loss function that penalizes interruptions in the vascular network. Our findings show that masking 40% of the segments yields optimal performance in reconstructing the interconnected vascular network. We test our method on three different types of retinal images across five separate datasets. The results demonstrate that MaskVSC outperforms state-of-the-art methods in maintaining vascular network completeness and segmentation accuracy. Furthermore, MaskVSC has been introduced to different segmentation backbones and has successfully improved performance. The code and 2PFM data are available at: <uri>https://github.com/Zhouyi-Zura/MaskVSC</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 6","pages":"2492-2503"},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10887048","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143417412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}