Latest Articles in International Journal of Imaging Systems and Technology

Dual-Branch Multimodal Attention Fusion Networks for Electrical Impedance and Microwave Dual-Mode Tomography of Stroke
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-03-27 DOI: 10.1002/ima.70337
Jinzhen Liu, Xiangqian Meng, Hui Xiong
Electrical impedance tomography (EIT) and microwave tomography (MWT), as two emerging noninvasive imaging techniques, have been widely used in stroke diagnosis. However, single-modality imaging exhibits inherent limitations such as insufficient information and low resolution, which pose challenges in meeting the requirements of clinical diagnosis. To enhance the resolution of stroke imaging, a dual-branch multimodal attention fusion network (DMAFusion) for electrical impedance and microwave dual-mode tomography (EI/MDT) is proposed. The network employs a dual-branch interactive encoder module to train the encoders separately for different modalities while enabling cross-modal information fusion. An improved efficient multiscale attention module further enhances feature extraction capabilities. An attentional feature fusion strategy is applied to deeply fuse features from different modalities, obtaining more comprehensive and accurate information. Comparative and robustness experiments demonstrate that DMAFusion outperforms single-modality imaging methods, achieving higher-resolution reconstructed images and improved robustness. Additionally, compared to other multimodal imaging networks, the proposed method significantly enhances image quality and reconstruction accuracy, further validating the effectiveness of EI/MDT and the superiority of DMAFusion in multimodal imaging. Therefore, through multimodal information fusion, the network provides a new technical means for clinical application in noninvasive and precise stroke diagnosis.
Citations: 0
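The attentional feature fusion step described in this abstract can be illustrated with a minimal per-channel gating sketch in NumPy. This is an assumption-laden stand-in rather than the paper's implementation: the function name `attention_fuse`, the sigmoid gate, and the pooling choice are all hypothetical.

```python
import numpy as np

def attention_fuse(feat_a, feat_b):
    """Attention-weighted fusion of two modality feature maps of shape (C, H, W).

    Hypothetical sketch of an attentional feature fusion step: a per-channel
    gate derived from both modalities decides how much each contributes.
    """
    # Global average pooling per channel over both inputs -> (C,)
    desc = feat_a.mean(axis=(1, 2)) + feat_b.mean(axis=(1, 2))
    # Sigmoid gate in [0, 1], one weight per channel
    gate = 1.0 / (1.0 + np.exp(-desc))
    gate = gate[:, None, None]                  # broadcast over H, W
    # Convex combination of the two modality features
    return gate * feat_a + (1.0 - gate) * feat_b
```

Because the gate forms a convex combination, identical inputs pass through unchanged; in a real network the gate would itself be learned.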
A Multiscale Intracerebral Hemorrhage Segmentation Framework Based on Wavelet Convolution and Feature Compensation Mechanism
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-03-27 DOI: 10.1002/ima.70345
Wenyu Wang, Huiyun Long, Fangfang Gou, Guangqian Kong, Xun Duan
Artificial intelligence–assisted diagnostic technologies are increasingly applied in the medical field, particularly for the diagnosis of intracerebral hemorrhage (ICH), a cerebrovascular disease with high mortality and disability rates. Although CT imaging is the primary modality for ICH diagnosis, accurate and rapid lesion segmentation remains challenging because hemorrhagic lesions are small, spatially variable, and heterogeneous in appearance, and conventional models are limited in detecting small lesions due to their small receptive fields and feature loss. To address these challenges, we propose WRes-UNet (Wavelet Convolution Residual-UNet), an enhanced U-Net–based segmentation framework that integrates Wavelet Transform Convolution (WTConv) and an Adaptive Feature Shortcut (AFS) mechanism. WTConv decomposes feature maps into multiple frequency subbands, enabling multiscale contextual representation while preserving critical low-frequency information. The AFS module adaptively enhances discriminative channel features, effectively compensating for feature loss and improving the localization of small hemorrhagic regions. Extensive experiments on an ICH CT dataset show that WRes-UNet consistently outperforms state-of-the-art segmentation models in Dice, IoU, and F1 scores, while achieving lower model complexity. In particular, the proposed framework demonstrates clear advantages in segmenting small and irregular hemorrhagic lesions. These results indicate that WRes-UNet provides an effective and robust solution for precise ICH segmentation, showing promising clinical value for early diagnosis. The code for WRes-UNet is available at https://github.com/GZWANGWENYU/WRes-UNet.
Citations: 0
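The frequency-subband decomposition that WTConv builds on can be shown with a one-level 2D Haar transform. This is an illustrative sketch (the actual WTConv applies learnable convolutions on the subbands; `haar_dwt2` and its normalization are assumptions):

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar decomposition of a feature map (H, W), H and W even.

    Returns the low-frequency approximation (LL) and three high-frequency
    detail subbands (LH, HL, HH), each of shape (H//2, W//2).
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # low-pass approximation
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh
```

A constant input lands entirely in the LL band, which is why the low-frequency path preserves the coarse lesion context the abstract mentions.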
Hybrid 1D CNN-BiLSTM Model for Early Parkinson's Disease Detection From Speech Signals
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-25 DOI: 10.1002/ima.70325
John Kehinde Olawuyi, Rajesh Prasad
Parkinson's Disease (PD) is a chronic neurodegenerative condition characterized by loss of dopaminergic neurons in a specific region of the brain. Symptoms such as hand tremors, walking difficulties, and impaired communication become noticeable in individuals with PD. Given this, early and accurate detection of PD remains a key challenge in clinical practice. The objective of this research is to design a robust and explainable framework for PD detection based on speech signal analysis. A hybrid 1D CNN-BiLSTM framework was designed to capture spatial feature patterns and temporal dependencies. Recursive Feature Elimination (RFE) was applied to select the 13 most discriminative speech features, while the Synthetic Minority Oversampling Technique (SMOTE) was integrated with 5-fold cross-validation to address class imbalance. Ablation studies assessed the contribution of each model component, and confusion matrix analysis enabled clinical interpretation by quantifying true positives, true negatives, false positives, and false negatives. The experimental findings demonstrated that the proposed 1D CNN-BiLSTM framework achieved strong predictive performance (92.10% accuracy, 96.43% precision, 93.10% recall, and 94.33% F1 score) and clinical reliability compared to alternative models, with high true positives indicating reliable patient identification and few false negatives reducing the risk of missed diagnoses. In conclusion, the proposed model demonstrates novelty by integrating explainability, feature selection, and robust validation. Its application provides a non-invasive and reliable framework to support Parkinson's disease screening and early clinical decision making.
Citations: 0
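SMOTE, used here for class balancing, interpolates synthetic minority samples between nearest neighbours. A from-scratch sketch of that interpolation (the study presumably uses a library implementation; `smote_sample` and its parameters are illustrative):

```python
import numpy as np

def smote_sample(minority, k=3, n_new=5, rng=None):
    """Generate synthetic minority-class samples by SMOTE-style interpolation.

    Minimal sketch (not the imbalanced-learn implementation): each synthetic
    point lies on the segment between a minority sample and one of its k
    nearest minority-class neighbours.
    """
    rng = np.random.default_rng(rng)
    n = len(minority)
    # Pairwise distances within the minority class
    dists = np.linalg.norm(minority[:, None] - minority[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)             # exclude self-matches
    neigh = np.argsort(dists, axis=1)[:, :k]    # k nearest neighbour indices
    out = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = neigh[i, rng.integers(k)]
        u = rng.random()                        # interpolation factor in [0, 1)
        out.append(minority[i] + u * (minority[j] - minority[i]))
    return np.array(out)
```

When combined with cross-validation, oversampling must be applied inside each training fold only, otherwise synthetic copies of validation samples leak into training.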
Post Hoc Interpretability in Swin UNETR-Based Volumetric Segmentation Using Supervoxel Attributions
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-25 DOI: 10.1002/ima.70323
Ankit Srivastava, Sandipan Bhowmick, Munesh Chandra, Ashim Saha
In 3D medical imaging, achieving accurate segmentation with deep learning is a vital task, but it is equally important to understand how models produce their results. Deep learning models often reach high performance, yet their inner workings are difficult to interpret. The healthcare sector tends to focus on accuracy, while interpretability and model bias are overlooked. Most explanation methods are designed for 2D data and struggle with the complexity of 3D volumes. This paper uses voxel-level attribution frameworks to identify which parts of a 3D image most influence the model's prediction, using a global binary mask to highlight the most relevant regions and filter out less important ones. The proposed framework uses the model-agnostic tool KernelSHAP and groups voxels into supervoxels, which reduces the computational load without compromising explanation quality. This combined approach makes it easier to understand how the model behaves in complex medical scenarios and provides clear, localized insight into model decisions. The framework supports more transparent and clinically trustworthy applications of deep learning in 3D medical image analysis.
Citations: 0
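The supervoxel-grouping idea can be approximated by a simple occlusion scan: perturb one region at a time and record the change in the model output. This is a simplified stand-in for KernelSHAP (which fits a weighted linear model over many masked coalitions rather than one region at a time); `supervoxel_attribution` and the zero baseline are assumptions for illustration.

```python
import numpy as np

def supervoxel_attribution(volume, predict, labels):
    """Occlusion-style attribution over supervoxel groups of a 3D volume.

    `labels` is an integer map assigning every voxel to a supervoxel id;
    each region is scored by the drop in the model output when zeroed out.
    """
    base = predict(volume)
    scores = {}
    for sv in np.unique(labels):
        masked = volume.copy()
        masked[labels == sv] = 0.0           # perturb one supervoxel
        scores[sv] = base - predict(masked)  # importance = output drop
    return scores
```

The cost scales with the number of supervoxels rather than the number of voxels, which is exactly the computational saving the abstract describes.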
DFABU-Net: A Dual Parallel Branch Medical Image Segmentation Model With Attention Mechanism
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-23 DOI: 10.1002/ima.70322
Yanxi Zhang, Tong Liu, Mujun Zang, Jing Gao, Shusen Zhou, Chanjuan Liu, Qingjun Wang
Medical image segmentation plays a critical role in computer-aided diagnosis and clinical decision-making; however, the performance of U-Net and its variants is often degraded when dealing with images exhibiting blurred lesion boundaries and pronounced imaging heterogeneity. To address these challenges, this paper proposes DFABU-Net (Double Frame Attention-Based U-Net), a dual-channel segmentation framework designed to enhance boundary perception and robust feature representation. Specifically, a dual-channel encoder architecture is constructed by incorporating a Large-Scale Feature Transmission (LSFT) encoder to capture global contextual and morphological information, while an Attention Contrast Fusion Module (ACFM) is introduced to perform comparative analysis and adaptive fusion of dual-channel features, thereby emphasizing boundary-related information and reducing ambiguity in edge pixel classification. Extensive experiments are conducted on a public benchmark dataset (LiTS) and an internal clinical dataset. On the LiTS dataset, DFABU-Net achieves a Dice coefficient of 98.34% and an mIoU of 96.20%, outperforming representative state-of-the-art methods by 0.44% in Dice and 0.45% in mIoU, respectively. On the internal dataset, DFABU-Net attains a Dice score of 97.64% and an mIoU of 95.88%, achieving the best performance among all compared methods and improving the Dice score by 0.49% over the strongest competing approach. Qualitative comparisons further demonstrate that DFABU-Net produces more accurate and complete lesion boundaries. In addition, independent prognostic analyses based on model-generated segmentations and expert annotations validate the clinical relevance and practical reliability of the proposed framework.
Citations: 0
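The Dice and IoU figures quoted throughout these abstracts follow the standard overlap definitions, which can be computed for binary masks as:

```python
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    """Dice coefficient and IoU for binary masks of any matching shape.

    Dice = 2|A∩B| / (|A| + |B|); IoU = |A∩B| / |A∪B|.
    The small eps guards against division by zero on empty masks.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (union + eps)
    return dice, iou
```

Dice is always at least as large as IoU for the same pair of masks, which is worth remembering when comparing the two columns in results tables.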
Enhancing Cerebrovascular Segmentation With Explainable Deep Semi-Supervised Learning: A Mean Teacher and Virtual Adversarial Training Approach
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-22 DOI: 10.1002/ima.70320
Thi-Da-Huong Chau, Ai-Hsien Adam Li, Yen-Jun Lai, Chien-Lung Chan
Cerebral vascular segmentation is crucial for diagnosing and treating stroke. Although advances in deep learning have significantly enhanced segmentation, supervised models still rely on large, annotated data, coupled with their black-box nature, presenting major challenges in medical settings. This study aims to develop an explainable deep semi-supervised learning framework that incorporates Virtual Adversarial Training (VAT) and the Mean Teacher (MT) model. This approach leverages both labeled and unlabeled datasets to boost segmentation performance while reducing dependency on extensive manual annotations. In addition, techniques from Explainable AI (XAI), including Gradient-weighted Class Activation Mapping (Grad-CAM) and Monte Carlo (MC) Dropout, are incorporated to visualize feature regions and quantify uncertainty, enhancing model interpretability. Experiments reveal that the proposed model outperforms supervised U-Net, the original MT, and the Uncertainty-Aware Mean Teacher (UA_MT) model, achieving a Dice Similarity Coefficient (DSC) of 0.776 with superior generalization on testing data. Grad-CAM visualizations confirm that the model correctly focuses on critical vascular structures, while the uncertainty maps identify areas with potential misclassification, guiding refinement and aiding clinical interpretation.
Citations: 0
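The Mean Teacher component keeps the teacher network as an exponential moving average (EMA) of the student's weights. A minimal sketch over plain parameter dictionaries (the decay value `alpha` is illustrative; in practice this runs over the model's tensors after every optimizer step):

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean Teacher update: teacher = alpha * teacher + (1 - alpha) * student.

    Operates on dicts mapping parameter names to values; the teacher is never
    trained by gradient descent, only smoothed toward the student.
    """
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k]
            for k in teacher}
```

The high decay makes the teacher a temporally smoothed ensemble of recent students, which is what makes its pseudo-targets stable enough to supervise the unlabeled data.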
SAM2-FNet: Medical Image Lesion Segmentation Model Based on Frequency Domain Expert Fusion Network
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-20 DOI: 10.1002/ima.70319
Shaoli Li, Zihua Zhang, Dejian Li, Bin Liu, Luyao He, Siying Guo
Recent advances in deep learning have improved medical image lesion segmentation. However, existing approaches remain impaired by the challenge of integrating local detail with global semantic context, which often leads to inaccurate boundaries and limited generalization. We propose SAM2-FNet, a frequency-domain expert fusion network addressing these limitations through four innovative components: (1) The frequency-enhanced ensemble module (FEEM) implements spectral decomposition via 2D Fourier transforms, segregating features into high/low-frequency components. These components undergo specialized processing through parallel branches with differential channel attention mechanisms, followed by adaptive fusion to optimize complementary information integration. (2) The fusion expert module (FEM) employs five lightweight subnetworks configured in a multi-expert architecture. During inference, dynamic weighting of expert outputs enables flexible adaptation to lesion heterogeneity, enhancing robustness across varied pathological presentations. (3) A lightweight frequency-spatial integrator (LFSI) is introduced as a substitutable replacement for standard 3 × 3 convolutions within FEEM. It employs a parameterized selective spatial projector for local receptive field modeling, amplifying salient structural responses and suppressing redundancies via a learnable selection mechanism. (4) The local refine module (LRM) bridges encoder-decoder semantic discrepancies through an atrous convolution pyramid structure, effectively recovering fine structural details during boundary reconstruction. Comprehensive evaluations demonstrate SAM2-FNet's superior performance across standard benchmarks. On ISIC2017, the model achieves absolute improvements of 2.6% in DSC and 3.2% in IoU over baseline implementations. For Kvasir-SEG, performance gains reach 3.8% DSC and 4.8% IoU. On the CVC-ClinicDB dataset, DSC and IoU increase by 2.8% and 3.9%, respectively. Comparative analysis with state-of-the-art approaches (including U-Mambabot and SwinUNETR) reveals consistent advantages in both DSC and IoU metrics, particularly for complex lesion morphologies. These findings suggest that SAM2-FNet offers a novel and effective solution for medical image lesion segmentation, with high theoretical value and practical application prospects. The source code is available at https://github.com/niubihonghong12345/SAM2-FNET.
Citations: 0
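The 2D-Fourier split into high- and low-frequency components used by FEEM can be sketched with a centered low-pass mask. The circular mask shape and radius are illustrative assumptions; by construction the two components sum back to the input, which is what makes the decomposition lossless before the branch-specific processing.

```python
import numpy as np

def frequency_split(feat, radius=4):
    """Split a 2D feature map into low- and high-frequency parts via FFT.

    A centered circular mask keeps low frequencies; its complement keeps
    high frequencies. low + high reconstructs the input exactly.
    """
    spec = np.fft.fftshift(np.fft.fft2(feat))   # center the DC component
    h, w = feat.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.fft.ifft2(np.fft.ifftshift(spec * mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spec * ~mask)).real
    return low, high
```

In a network, each part would then feed its own attention branch before adaptive re-fusion, as the abstract describes.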
Pearson Correlation Coefficient-Guided Dynamic Supervision and Dual-Attention Network for MR-to-CT Image Synthesis
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-19 DOI: 10.1002/ima.70310
Ruiming Zhu, Xinliang Liu, Yin Dai, Wei Qian, Yueyang Teng
Accurate synthesis of computed tomography (CT) images from magnetic resonance (MR) scans is essential for reducing radiation exposure and enabling fully MR-based workflows in clinical applications such as radiotherapy planning. To address this task, we propose a novel framework, the Pearson correlation coefficient-guided dual-attention network (PDANet), which integrates dynamic supervision and attention-driven representation learning for MR-to-CT image synthesis. PDANet tackles two key limitations in existing methods: fixed loss weighting and insufficient feature modeling. First, a Pearson correlation coefficient-guided dynamic supervision strategy is introduced to adaptively balance pixel-wise and perceptual losses throughout training, which allows the model to emphasize perceptual consistency in early stages and gradually shift toward pixel-level refinement as structural similarity improves. Second, a dual-attention mechanism is incorporated into the generator to enhance long-range dependency modeling and spatially aware feature representation, improving the synthesis of anatomical structures across varying scales. Experimental results on two publicly available datasets (pelvis and brain) demonstrate that PDANet consistently outperforms state-of-the-art methods in terms of structural fidelity and visual quality, highlighting the robustness and generalizability of the approach across diverse anatomical regions.
Citations: 0
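A correlation-guided weighting of the two loss terms might look like the following. The linear mapping from the Pearson r to the weights is a hypothetical choice made for illustration, since the abstract does not give PDANet's exact schedule; only the qualitative behavior (low correlation favors the perceptual term, high correlation favors the pixel term) comes from the text.

```python
import numpy as np

def dynamic_loss_weights(pred, target):
    """Map the Pearson correlation between prediction and target to loss weights.

    Early in training correlation is low, so weight goes to the perceptual
    loss; as structural similarity improves, weight shifts to the pixel loss.
    """
    r = np.corrcoef(pred.ravel(), target.ravel())[0, 1]
    w_pixel = (r + 1.0) / 2.0          # rescale r from [-1, 1] to [0, 1]
    w_perceptual = 1.0 - w_pixel
    return w_pixel, w_perceptual
```

The total loss would then be `w_pixel * l1 + w_perceptual * perceptual`, recomputed each epoch or step.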
Adapting 2D Vision Transformer Backbones for 3D Thoracic Multi-Organ Segmentation
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-19 DOI: 10.1002/ima.70318
Levent Karacan, Hamdi Yalın Yalıç, Alaettin Uçan, Ali Yaşar Yiğit, Adem Ali Yılmaz
Accurate multi-organ segmentation in thoracic CT scans is essential for radiotherapy planning and clinical diagnosis. However, this task remains challenging due to large anatomical variability, small organ sizes, inter-slice discontinuities, and the computational demands of volumetric segmentation. We propose PVT3D-ThoraxNet, a hybrid 2D–3D framework that integrates a Pyramid Vision Transformer (PVTv2) encoder with a convolutional 3D decoder via a novel 3D Context Encoder, enabling effective fusion of multi-slice features. To further enhance structural consistency, we introduce a 3D Trainable Guided Filter (TGF) in the decoder for boundary refinement. On the Lung CT Segmentation Challenge (LCTSC) dataset across five thoracic organs (esophagus, heart, left lung, right lung, spinal cord), PVT3D-ThoraxNet achieves a mean Dice Similarity Coefficient of 0.903 and a mean HD95 of 3.59 mm. On a private thoracic CT dataset, it generalizes well with a mean Dice of 0.875 and a mean HD95 of 4.81 mm, without dataset-specific fine-tuning. Compared with recent multi-stage and transformer-based approaches, our framework provides a lightweight, robust, and accurate solution for thoracic multi-organ segmentation.
Citations: 0
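The HD95 boundary metric reported above is the 95th percentile of nearest-neighbour surface distances, taken symmetrically in both directions. A point-set sketch (real evaluations extract boundary voxels from the masks and scale by the CT voxel spacing; the brute-force distance matrix here is only practical for small point sets):

```python
import numpy as np

def hd95(points_a, points_b):
    """95th-percentile Hausdorff distance between two point sets of shape (N, D).

    Takes the 95th percentile of directed nearest-neighbour distances in each
    direction, then returns the larger of the two (symmetric HD95).
    """
    # Pairwise Euclidean distances, shape (len(a), len(b))
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    a_to_b = d.min(axis=1)   # for each point in A, distance to nearest in B
    b_to_a = d.min(axis=0)   # for each point in B, distance to nearest in A
    return max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))
```

Unlike the plain Hausdorff distance, the 95th-percentile version is robust to a few outlier boundary voxels, which is why it is the standard report in segmentation papers.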
U2-DBA: A Dual-Scale Boundary-Aware Network With Feature-Boundary-Skeleton Loss for Robust Skin Lesion Segmentation
IF 2.5 · CAS Q4 · Computer Science
International Journal of Imaging Systems and Technology Pub Date : 2026-02-17 DOI: 10.1002/ima.70315
Zhiyan Che, Ruyun Chen, Hao Chen, Yonggui Li
Accurate segmentation of skin lesions is crucial for dependable computer-aided diagnosis of melanoma. However, many existing deep learning models still have difficulty dealing with vague lesion borders, uneven appearance, and unstable performance when applied to new datasets. This paper proposes a dual-scale boundary-aware network (U2-DBA) for dermoscopic image segmentation. The model includes a nested U-in-U encoder that captures both local and global features, a dual-branch gating module that balances semantic and structural information, and a decoder that focuses on preserving boundary details. We further propose a novel Feature-Boundary-Skeleton (FBS) loss function, which integrates region overlap, edge gradient, and skeleton-level shape constraints to enhance segmentation accuracy and structural consistency. To evaluate model efficiency, we introduce the Smooth Accuracy-Compactness Score (SACS), combining Dice and IoU metrics with a logarithmic penalty on model size. Experiments conducted on the ISIC 2018 dataset demonstrate that U2-DBA achieves high performance (Dice = 0.884, IoU = 0.799) and outperforms six state-of-the-art models in SACS. When directly evaluated on PH2 and HAM10000 without fine-tuning, the model retains strong performance. These findings indicate that U2-DBA is not only accurate and compact but also generalizes effectively across diverse datasets, offering a practical and deployable solution for clinical dermoscopic lesion segmentation. The code is available at https://github.com/kid-od/U2-DBA.
Citations: 0
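The abstract says SACS combines Dice and IoU with a logarithmic penalty on model size but does not give the formula; one plausible form, purely for illustration (the averaging, the penalty weight `lam`, and the function name `sacs` are all assumptions, not the paper's definition):

```python
import math

def sacs(dice, iou, n_params, lam=0.01):
    """A hypothetical form of the Smooth Accuracy-Compactness Score (SACS).

    Averages the two overlap metrics, then subtracts a logarithmic penalty
    that grows slowly with parameter count, so compact models score higher
    at equal accuracy.
    """
    accuracy = 0.5 * (dice + iou)
    return accuracy - lam * math.log(n_params)
```

Whatever the exact definition, the logarithm means a 10x larger model pays only a constant additional penalty, favoring compactness without dominating accuracy.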