{"title":"Carotid Artery Plague Segmentation Model Based on Dual-Modal","authors":"Chun He, Zhanquan Sun, Man Chen, Yunqian Huang","doi":"10.1002/ima.70149","DOIUrl":"https://doi.org/10.1002/ima.70149","url":null,"abstract":"<div>\u0000 \u0000 <p>Ultrasonography (US) and contrast-enhanced ultrasound (CEUS) are effective imaging tools for analyzing the spatial and temporal characteristics of lesions and diagnosing or predicting diseases. At the same time, US is characterized by blurred boundaries and strong noise interference. Therefore, evaluating plaques and depicting lesions frame-by-frame is a time-consuming task, which poses a challenge in analyzing US videos using deep learning techniques. However, despite the existing methods for US and CEUS image segmentation, there are still limited approaches capable of integrating the feature information from these two distinct image types. Furthermore, these methods require additional optimization to enhance their capacity for extracting comprehensive global contextual information. To address the problem, we propose a U-shaped structured network model based on Transformer in this paper. The network is composed of two parts, that is, the dual-modal information interaction fusion module and the enhanced feature extraction module. The first module is used to extract comprehensive US and CEUS features and fuse them at multiple scales. The second module is used to enhance feature extraction capabilities. This network enables precise localization of the lesion and clear depiction of the region of interest in US. Our model achieved a Dice of 91.62% and an IoU of 88.04% on the carotid plaque segmentation dataset. The experimental results show that the performance of our designed network on the carotid artery dataset is better than that of the SOTA models.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144515005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BFR-Unet: A Full-Resolution Model for Efficient Segmentation of Tiny Blood Vessels","authors":"Feng Liu, Jipeng Sun","doi":"10.1002/ima.70148","DOIUrl":"https://doi.org/10.1002/ima.70148","url":null,"abstract":"<div>\u0000 \u0000 <p>Retinal blood vessel segmentation plays a crucial role in diagnosing retinal diseases, where accurate and complete vessel segmentation is essential for reliable diagnosis. Currently, U-Net remains one of the most widely used architectures for retinal blood vessel segmentation. However, due to the complexity and variability of retinal structures, the blood vessel edges are often very thin, and the low contrast of retinal images further complicates accurate segmentation. These challenges frequently result in U-Net models failing to precisely capture vessel boundaries. To address this issue, a novel full-resolution retinal blood vessel segmentation network, termed BFR-Net, is proposed. The BFR-Net is composed of three primary modules: the multi-residual convolution module, the boundary attention module, and the feature fusion module. The multi-residual convolution module, forming the backbone of the network, enables effective extraction of contextual information across the full resolution. The boundary attention module processes outputs from both the backbone and different network levels to capture detailed edge features, thus enhancing the segmentation performance. Finally, the feature fusion module integrates features from the backbone and boundary attention modules, further improving overall network performance. The performance of the proposed model is evaluated on three commonly used retinal vessel segmentation datasets. Experimental results demonstrate that BFR-Net achieves advanced performance, particularly in segmenting vessel edges and small blood vessels. Specifically, on the DRIVE and CHSAE_DB1 datasets, the Se and F1 scores are 0.8646, 0.8244, 0.8838, and 0.8108, respectively. These results demonstrate that the proposed network exhibits excellent performance in segmenting vessel boundaries and fine vessels.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Stroke Segmentation and Classification Performance Using a Goal-Oriented Deep Learning Framework","authors":"Büşra Uygun, Ayşe Demirhan","doi":"10.1002/ima.70147","DOIUrl":"https://doi.org/10.1002/ima.70147","url":null,"abstract":"<div>\u0000 \u0000 <p>CT scans play a crucial role in diagnosing and planning treatment for strokes, offering essential insights into the location, size, and extent of bleeding in brain tissue. This study explores two distinct scenarios for stroke detection, classification, and segmentation, utilizing 6951 brain CT images from the TEKNOFEST competition. In both scenarios, CT images undergo preprocessing steps involving skull-stripping, normalization, and image augmentation. In the first scenario, stroke presence-absence classification achieved a 98% success rate on test images. Subsequent segmentation of images with strokes resulted in Dice scores of 59% for ischemic stroke and 67% for hemorrhagic stroke on test images. The classification of stroke types as ischemic and hemorrhagic achieved a 100% success rate, with a 97% success rate when directly classifying stroke types in images without segmentation. This indicates a 3% performance improvement when applying the classification process after stroke region segmentation. In the second scenario, a three-class classification of no stroke, ischemic stroke, and hemorrhagic stroke achieved an average of 97% success on test images. Post-classification, separately created models for the segmentation of ischemic and hemorrhagic strokes yielded Dice scores of 78% and 79%, respectively. The second scenario demonstrated a performance improvement of 19% and 12% for the segmentation of ischemic and hemorrhagic strokes through the post-classification segmentation process. The proposed approach outperforms competing teams in the competition rankings.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144493001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Lung Sound Signal Image Feature Recognition Based on Temporal and Spatial Dual-Channel Long- and Short-Term Memory Model","authors":"Li Xueri, Hu Ruo, Xu Hong, Zhao Huimin","doi":"10.1002/ima.70141","DOIUrl":"https://doi.org/10.1002/ima.70141","url":null,"abstract":"<p>In this paper, through the study on the transformation of lung sound signal into image feature signal processing, we further mastered the processing process of lung sound signal, and used the new neural network model to identify and diagnose the image features of lung sound, effectively improving the effect of clinical AI-assisted diagnosis. To solve the problem that the traditional neural network model cannot obtain the temporal and spatial characteristics of lung sound signals at the same time, we propose a DCCLSTM (Dual-Channel Convolutional neural network for Long- and Short-Time Memory) to obtain spatial information and temporal information features of lung sound simultaneously. New features are generated by weighted fusion, which can effectively make up for the problem that the resolution of the feature map extracted by the traditional neural network model is reduced. This report presents the results of studies conducted on the lung sound dataset, and the accuracy rate of Dalal_CNN with the best effect was 89.56%. The DCCLSTM proposed in this study has a recognition accuracy of 97.40%. Experiments show that the DCCLSTM method is more accurate than the Dalal_CNN method.</p>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ima.70141","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144339541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LiteVessel: In-Depth Exploration of Lightweight Deep Neural Network Models for Retinal Vessel Segmentation","authors":"Musaed Alhussein, Khursheed Aurangzeb, Kashif Fareed, Mazhar Islam, Rasha Sarhan Alharthi","doi":"10.1002/ima.70145","DOIUrl":"https://doi.org/10.1002/ima.70145","url":null,"abstract":"<div>\u0000 \u0000 <p>Deep learning has been used over the past decade for diagnosis applications in healthcare including ophthalmology. The integration of deep learning models with embedded systems to attain real-time processing of diagnosis becomes ineffective due to the resource constraints of embedded systems and higher computation and memory requirements of DNNs. To overcome this issue, this work aims to optimize an encoder–decoder architecture to demonstrate the potential for porting a DL model to any general embedded platform for eye disease diagnosis in the early stage. In this paper, we tested different model architectures to reduce the computation complexity of the DL model without compromising performance metrics. To train and test our optimized models, we utilized available databases of retinal images such as DRIVE, CHASE_DB1, and STARE. Although the computational complexity was much lower, the developed models achieved competitive performance compared with the existing state-of-the-art. Furthermore, we implemented a cross-training approach, and the findings illustrate the generalizability and resilience of the methods presented. The reduced number of parameters, computational complexity, and enhanced segmentation performance of retinal vessel segmentation make the proposed methods suitable for use in automated diagnostic systems.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144339537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TAU-EffNetB7: A Novel Triple Attention U-Net Approach Using EfficientNetB7 for Enhanced Polyp Segmentation","authors":"Fouzia El Abassi, Aziz Darouichi, Aziz Ouaarab","doi":"10.1002/ima.70144","DOIUrl":"https://doi.org/10.1002/ima.70144","url":null,"abstract":"<div>\u0000 \u0000 <p>Polyp segmentation is a critical but challenging process in clinical imaging since colonoscopic images are inherently complex and heterogeneous. Conventional single-stage segmentation networks lack good generalization and achieve only acceptable accuracy, particularly for small or uncertain polyps. To address these constraints, we propose two new models: TAU-EffNetB7 and TAU-EffNetB7 + Residual. These models apply triple-attention U-Net and triple-attention residual architectures, respectively, and incorporate cascaded stages, attention and residual operations, Atrous Spatial Pyramid Pooling, and transfer learning from EfficientNetB7. The multi-stage architecture enables progressive refinement of segmentations, better capture of multi-scale features, and accurate depiction of intricate boundaries. We evaluate our models on three publicly available colonoscopic datasets: Kvasir-SEG, CVC-ClinicDB, and CVC-ColonDB. The TAU-EffNetB7 attains Dice Similarity Coefficients (DSC) of 89.54%, 94.62%, and 94.68% on each dataset, respectively. The TAU-EffNetB7 + Residual model performs even better, achieving DSCs of 91.11%, 93.74%, and 94.72%, significantly outperforming baseline models such as U-Net and Attention U-Net. To assess generalization, we carry out experiments where models are trained with small subsets of data (Kvasir-SEG1, CVC-ClinicDB1, and CVC-ColonDB1) and tested on the full datasets. Both models demonstrate strong performance even with limited training data. TAU-EffNetB7 achieves 90.18% DSC when trained on Kvasir-SEG1, whereas TAU-EffNetB7 + Residual achieves 94.17% on CVC-ClinicDB and 94.68% on CVC-ColonDB when trained on their respective subsets. Notably, the residual-augmented model outperforms its counterpart in all but a few low-data scenarios.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144332014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D MedicalDet-Mamba: A Hybrid Mamba-CNN Network for Medical Object Detection and Localization","authors":"Shanshan Li, Zijie Shen, Yuhan Zhang, Hua Lai, Song Tan, Wei Chen","doi":"10.1002/ima.70139","DOIUrl":"https://doi.org/10.1002/ima.70139","url":null,"abstract":"<div>\u0000 \u0000 <p>3D object detection in medical imaging poses significant challenges due to the high dimensionality and complex spatial relationships of volumetric data. Recent advancements with convolutional neural network (CNN)- and transformer-based approaches have shown promise; however, CNNs struggle with capturing long-range dependencies, while transformers incur high computational and memory costs when handling high-resolution 3D medical images. Mamba-based models offer an efficient alternative by modeling long-range dependencies in a linear manner, reducing complexity while maintaining effective feature representation. This study introduces 3D MedicalDet-Mamba, a novel hybrid framework that integrates the complementary strengths of CNNs and Mamba for precise 3D medical object detection and localization. Specifically, we propose the locality-integrated Mamba (LIM) module, which combines parallel multi-kernel convolutions with Mamba-based blocks to capture both global dependencies and fine-grained local structures, ensuring a more comprehensive feature representation. Additionally, we introduce the inter-scale aggregation Mamba (ISAM) block, a Mamba-based component that leverages hexa-hierarchical 3D slice (HH3S) scanning to aggregate multi-scale voxel-level features. This mechanism enhances the separation of medical objects from complex backgrounds while improving global feature extraction efficiency. Experimental results on public datasets show that 3D MedicalDet-Mamba outperforms state-of-the-art methods in both detection and localization accuracy. Code is available at https://github.com/ssli23/3D-MedicalDet-Mamba.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144315061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Microaneurysm Segmentation Method for Diabetic Retinopathy Fundus Lesions Based on the Multi-Scale Feature Subtraction Fusion Network","authors":"Jie Zhang, Xiangyu Jiang, Shuting Ni, Shuang Liu, Wei Zou","doi":"10.1002/ima.70140","DOIUrl":"https://doi.org/10.1002/ima.70140","url":null,"abstract":"<div>\u0000 \u0000 <p>Detecting and segmenting microaneurysms can help doctors diagnose the condition and formulate subsequent treatment plans. A multi-scale feature subtraction fusion network is proposed in this paper. It includes two modules: the multi-scale feature compensation module and the subtraction fusion module. In the multi-scale feature compensation module, the features between adjacent levels of the network are fused. Considering that simply concatenating features may lead to feature redundancy, a subtraction fusion module is designed. To enable the neural network to extract more detailed information, a branch is introduced. A wavelet attention enhancement module is designed to transform the channel attention of frequency coefficients extracted by wavelet transform. The proposed method can help the network learn feature diversity better, and hence can improve segmentation performance. Experimental results show that, as compared to the existing methods, the proposed method can achieve better performance with Dice coefficients of 0.4481, 0.4860, and 0.3561 on the IDRID, E-Ophtha, and DDR datasets, respectively.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144308715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fully Automated Mandibular Condyle Segmentation: More Detailed Extraction With Hybrid Customized SAM","authors":"Zihang Huang, Yaning Feng, Lilin Guo, Qiutao Shi, Wei Jin","doi":"10.1002/ima.70138","DOIUrl":"https://doi.org/10.1002/ima.70138","url":null,"abstract":"<div>\u0000 \u0000 <p>Accurate segmentation of the mandibular condyle is a key step in three-dimensional reconstruction, which is clinically crucial for digital surgical planning in oral and maxillofacial surgery. Quantitative analysis of its volume and morphology can provide an objective basis for preoperative assessment and postoperative efficacy evaluation. Although many deep learning-based approaches have achieved remarkable success, several challenges persist. Current methods are constrained by low-resolution global image maps, produce masks with blurred boundaries, and require large datasets to ensure accuracy and robustness. To address these challenges, we propose a novel framework for condylar segmentation by adapting the “Segmentation Anything Model” (SAM) to cone beam computed tomography (CBCT) imaging data, with targeted architectural optimizations to enhance segmentation accuracy and boundary delineation. Our framework introduces two novel architectural components: (1) a dual-adapter system combining feature augmentation and transformer-level prompt enhancement to improve target-specific contextual learning, and (2) a boundary-optimized loss function that prioritizes anatomical edge fidelity. For clinical practicality, we further develop ConDetector to enable fully automated prompting without manual intervention. Through extensive experiments, we have shown that our adapted SAM (using Ground Truth as a prompt) achieves state-of-the-art performance, reaching a Dice coefficient of 94.73% on a relatively small sample set. The fully automated SAM even achieves the second-best segmentation performance, with a Dice coefficient of 94.00%. Our approach exhibits robust segmentation capabilities and achieves excellent performance even with limited training data.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144300292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TPA-Seg: Multi-Class Nucleus Segmentation Using Text Prompts and Cross-Attention","authors":"Yao-Ming Liang, Shi-Yu Lin, Zu-Xuan Wang, Ling-Feng Yang, Yi-Bo Jin, Yan-Hong Ji","doi":"10.1002/ima.70125","DOIUrl":"https://doi.org/10.1002/ima.70125","url":null,"abstract":"<div>\u0000 \u0000 <p>Precise semantic segmentation of nuclei in pathological images is a crucial step in pathological diagnosis and analysis. Given the limited scale and the high cost of annotation for current pathological datasets, appropriately incorporating textual prompts as prior knowledge is key to achieving high-accuracy multi-class segmentation. These text prompts can be derived from image information such as the morphology, size, location, and density of nuclei in medical images. The text prompts are processed by a text encoder to obtain textual features, while the images are processed by an image encoder to obtain multi-scale feature maps. These features are then fused through feature fusion blocks, allowing the features to interact and be perceived in a multi-scale multimodal manner. Finally, metric learning and weighted loss functions are introduced to prevent feature loss caused by a small number of categories or small target sizes in the image. Experimental results on multiple pathological image datasets demonstrate that our method is effective and outperforms existing models in the segmentation of pathological images. Furthermore, the study verifies the effectiveness of each module and evaluates the potential of different types of text prompts in improving performance. The insights and methods proposed may offer a novel solution for segmentation and classification tasks. The code can be viewed at https://github.com/kahhh743/TPA-Seg.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 4","pages":""},"PeriodicalIF":3.0,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144281540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}