FDT-Net: Frequency-Aware Dual-Branch Transformer-Based Optic Cup and Optic Disk Segmentation With Parallel Contour Information Mining and Uncertainty-Guided Refinement
Jierui Gan, Hongqing Zhu, Tianwei Qian, Jiahao Liu, Ning Chen, Ziying Wang
International Journal of Imaging Systems and Technology, vol. 34, no. 6. DOI: 10.1002/ima.23199. Published 2024-10-21.

Abstract: Accurate segmentation of the optic cup and disc in fundus images is crucial for the prevention and diagnosis of glaucoma. However, interference from structures such as blood vessels poses challenges, and mainstream networks often demonstrate limited capacity for extracting contour information. In this paper, we propose a segmentation framework named FDT-Net, built on a frequency-aware dual-branch Transformer (FDBT) architecture with parallel contour information mining and uncertainty-guided refinement. Specifically, we design an FDBT that operates in the frequency domain. This module leverages the inherent contextual awareness of Transformers and uses the Discrete Cosine Transform (DCT) to mitigate the impact of certain interference factors on segmentation. The FDBT comprises global and local branches that independently extract global and local information, thereby enhancing segmentation results. Moreover, to mine additional contour information, this study develops a parallel contour information mining (PCIM) module that operates in parallel. These modules capture more detail along the edges of the optic cup and disc while avoiding mutual interference, optimizing segmentation performance in contour regions. Furthermore, we propose an uncertainty-guided refinement (UGR) module, which generates and quantifies uncertainty mass and leverages it to enhance model performance based on subjective logic theory. Experimental results on two publicly available datasets demonstrate the superior performance and competitive advantages of the proposed FDT-Net. The code for this project is available at https://github.com/Rookie49144/FDT-Net.
{"title":"M-Net: A Skin Cancer Classification With Improved Convolutional Neural Network Based on the Enhanced Gray Wolf Optimization Algorithm","authors":"Zhinan Xu, Xiaoxia Zhang, Luzhou Liu","doi":"10.1002/ima.23202","DOIUrl":"https://doi.org/10.1002/ima.23202","url":null,"abstract":"<div>\u0000 \u0000 <p>Skin cancer is a common malignant tumor causing tens of thousands of deaths each year, making early detection essential for better treatment outcomes. However, the similar visual characteristics of skin lesions make it challenging to accurately differentiate between lesion types. With advancements in deep learning, researchers have increasingly turned to convolutional neural networks for skin cancer detection and classification. In this article, an improved skin cancer classification model M-Net is proposed, and the enhanced gray wolf optimization algorithm is combined to improve the classification performance. The gray wolf optimization algorithm guides the wolf pack to prey through a multileader structure and gradually converges through the encirclement and pursuit mechanism, so as to perform a more detailed search in the later stage. To further improve the performance of the gray wolf optimization, this study introduces the simulated annealing algorithm to avoid falling into the local optimal state and expands the search range by improving the search mechanism, thus enhancing the global optimization ability of the algorithm. The M-Net model significantly improves the accuracy of classification by extracting features of skin lesions and optimizing parameters with the enhanced gray wolf optimization algorithm. The experimental results based on the ISIC 2018 dataset show that compared with the baseline model, the feature extraction network of the model has achieved a significant improvement in accuracy. The classification performance of M-Net is excellent in multiple indicators, with accuracy, precision, recall, and F1 score reaching 0.891, 0.857, 0.895, and 0.872, respectively. In addition, the modular design of M-Net enables it to flexibly adjust feature extraction and classification modules to adapt to different classification tasks, showing great scalability and applicability. In general, the model proposed in this article performs well in the classification of skin lesions, has broad clinical application prospects, and provides strong support for promoting the diagnosis of skin diseases.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 6","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142451158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Fusion for Multiple Diseases Features Enhancement
Sajid Ullah Khan, Meshal Alharbi, Sajid Shah, Mohammed ELAffendi
International Journal of Imaging Systems and Technology, vol. 34, no. 6. DOI: 10.1002/ima.23197. Published 2024-10-17.

Abstract: Throughout the past 20 years, medical imaging has found extensive application in clinical diagnosis, and doctors may find it difficult to diagnose diseases using only one imaging modality. The main objective of multimodal medical image fusion (MMIF) is to improve both the accuracy and quality of clinical assessments by extracting structural and spectral information from source images. This study proposes a novel MMIF method to assist doctors and to support downstream tasks such as image segmentation, classification, and further surgical procedures. Initially, the intensity-hue-saturation (IHS) model is used to decompose the positron emission tomography (PET)/single photon emission computed tomography (SPECT) image, followed by a hue-angle mapping method to discriminate high- and low-activity regions in the PET images. Then, a proposed structure feature adjustment (SFA) mechanism is used as a fusion strategy for high- and low-activity regions to obtain structural and anatomical details with minimal color distortion. In the second step, a new multi-discriminator generative adversarial network (MDcGAN) approach is proposed for obtaining the final fused image. Qualitative and quantitative results demonstrate that the proposed method is superior to existing MMIF methods in preserving the structural, anatomical, and functional details of PET/SPECT images. Through our assessment, involving visual analysis and subsequent verification with statistical metrics, it becomes evident that color changes contribute substantial visual information to the fusion of PET and MR images. The quantitative outcomes show that the proposed algorithm outperformed other methods in the majority of cases and achieved the second-highest results in a few instances. The validity of the proposed method was confirmed using diverse modalities, encompassing a total of 1012 image pairs.
{"title":"Optimization and Application Analysis of Phase Correction Method Based on Improved Image Registration in Ultrasonic Image Detection","authors":"Nannan Lu, Hongyan Shu","doi":"10.1002/ima.23185","DOIUrl":"https://doi.org/10.1002/ima.23185","url":null,"abstract":"<p>In order to prevent and detect a wide range of disorders, including those of the brain, thoracic, digestive, urogenital, and cardiovascular systems, ultrasound technology is essential for assessing physiological data and tissue morphology. Its capacity to deliver real-time, high-frequency scans makes it a handy and non-invasive diagnostic tool. However, issues like patient movements and probe jitter from human error can provide a large amount of interference, resulting in inaccurate test findings. Techniques for image registration can assist in locating and eliminating unwanted interference while maintaining crucial data. Even though there has been research on improving these techniques in Matlab, there are no specialized systems for interference removal, and the procedure is still time-consuming, particularly when working with huge quantities of ultrasound images. The phase correlation technique, which converts images into the frequency domain and makes noise suppression easier, is one of the most efficient algorithms now in use since it can tolerate noise with resilience. Nevertheless, little research has been done on using this technique to identify displacement in blood vessel wall ultrasound images. To address these gaps, this work presents an image registration system that uses the phase correlation algorithm. The system provides rotation, zoom registration, picture translation, and displacement detection of the vessel wall in addition to interference removal. Furthermore, batch processing is included to increase the effectiveness of registering multiple ultrasound pictures. Through efficient interference management and streamlined registration, this method offers a workable way to improve the precision and efficacy of ultrasonic diagnostics.</p>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 6","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ima.23185","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142429964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature Pyramid Network Based Spatial Attention and Cross-Level Semantic Similarity for Diseases Segmentation From Capsule Endoscopy Images
Said Charfi, Mohamed EL Ansari, Lahcen Koutti, Ilyas ELjaafari, Ayoub ELLahyani
International Journal of Imaging Systems and Technology, vol. 34, no. 6. DOI: 10.1002/ima.23194. Published 2024-10-07.

Abstract: Wireless capsule endoscopy (WCE) is an emerging technology that uses a pill-sized camera to visualize the digestive tract. WCE presents several advantages: it is far less invasive, requires no sedation, and has fewer possible complications than standard endoscopy. Hence, it can be exploited as an alternative to the standard procedure. WCE is used to diagnose a variety of gastrointestinal diseases such as polyps, ulcers, Crohn's disease, and hemorrhages. Nevertheless, the WCE videos produced by an examination may contain thousands of frames per patient that must be reviewed by medical specialists; moreover, the capsule's free movement and technological limits lead to low-quality images. Developing an automatic tool based on artificial intelligence could therefore be very helpful. Most state-of-the-art works target image classification (normal/abnormal) while ignoring disease segmentation. Therefore, this work presents a novel method based on the Feature Pyramid Network model, aimed at disease segmentation from WCE images. In this model, modules to optimize and combine features are employed: semantic and spatial features are mutually compensated by spatial attention and cross-level global feature fusion modules. The proposed method achieves a testing F1-score and mean intersection over union of 94.149% and 89.414%, respectively, on the MICCAI 2017 dataset, and of 94.557% and 90.416%, respectively, on the KID Atlas dataset. In the performance analysis, the mean intersection over union on the MICCAI 2017 dataset exceeds existing approaches by 20.414%, 18.484%, 11.444%, and 8.794%. Moreover, the proposed scheme surpasses the comparison methods by 29.986% and 9.416% in mean intersection over union on the KID Atlas dataset. These results indicate that the proposed approach is promising for disease segmentation from WCE images.
A Multispectral Blood Smear Background Images Reconstruction for Malaria Unstained Images Normalization
Solange Doumun OULAI, Sophie Dabo-Niang, Jérémie Zoueu
International Journal of Imaging Systems and Technology, vol. 34, no. 6. DOI: 10.1002/ima.23182. Published 2024-10-04. Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ima.23182

Abstract: Multispectral and multimodal unstained blood smear images are acquired and evaluated to provide computer-assisted automated diagnostic evidence for malaria. However, these images suffer from uneven lighting, contrast variability, and local luminosity effects due to the acquisition system, which significantly impacts the diagnostic process and its overall outcomes. To overcome this limitation, it is crucial to normalize the acquired multispectral images as a preprocessing step for malaria parasite detection. In this study, we propose a novel method for this normalization, aiming to improve the accuracy and reliability of the diagnostic process. The method estimates the bright reference image, which captures the luminosity, and the contrast variability function from the background region of the image. This is achieved through two distinct resampling methodologies: Gaussian random field simulation by variogram analysis and bootstrap resampling. A method for handling the intensity saturation of certain pixels, based on outlier imputation, is also proposed. Both proposed normalization approaches are shown to outperform existing methods for multispectral and multimodal unstained blood smear images, as measured by the Structural Similarity Index Measure (SSIM), Mean Squared Error (MSE), Zero-mean Sum of Absolute Differences (ZSAD), Peak Signal-to-Noise Ratio (PSNR), and Absolute Mean Brightness Error (AMBE). These methods not only improve image contrast but also preserve the spectral footprint and natural appearance of the images more accurately. The normalization technique employing bootstrap resampling reduces the acquisition time for multimodal and multispectral images by 66%. Moreover, the processing time for bootstrap resampling is less than 4% of that required for Gaussian random field simulation.
{"title":"Enhanced Lung Cancer Diagnosis and Staging With HRNeT: A Deep Learning Approach","authors":"N. Rathan, S. Lokesh","doi":"10.1002/ima.23193","DOIUrl":"https://doi.org/10.1002/ima.23193","url":null,"abstract":"<div>\u0000 \u0000 <p>The healthcare industry has been significantly impacted by the widespread adoption of advanced technologies such as deep learning (DL) and artificial intelligence (AI). Among various applications, computer-aided diagnosis has become a critical tool to enhance medical practice. In this research, we introduce a hybrid approach that combines a deep neural model, data collection, and classification methods for CT scans. This approach aims to detect and classify the severity of pulmonary disease and the stages of lung cancer. Our proposed lung cancer detector and stage classifier (LCDSC) demonstrate greater performance, achieving higher accuracy, sensitivity, specificity, recall, and precision. We employ an active contour model for lung cancer segmentation and high-resolution net (HRNet) for stage classification. This methodology is validated using the industry-standard benchmark image dataset lung image database consortium and image database resource initiative (LIDC-IDRI). The results show a remarkable accuracy of 98.4% in classifying lung cancer stages. Our approach presents a promising solution for early lung cancer diagnosis, potentially leading to improved patient outcomes.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 6","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142429272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MFH-Net: A Hybrid CNN-Transformer Network Based Multi-Scale Fusion for Medical Image Segmentation","authors":"Ying Wang, Meng Zhang, Jian'an Liang, Meiyan Liang","doi":"10.1002/ima.23192","DOIUrl":"https://doi.org/10.1002/ima.23192","url":null,"abstract":"<div>\u0000 \u0000 <p>In recent years, U-Net and its variants have gained widespread use in medical image segmentation. One key aspect of U-Net's design is the skip connection, facilitating the retention of detailed information and leading to finer segmentation results. However, existing research often concentrates on enhancing either the encoder or decoder, neglecting the semantic gap between them, and resulting in suboptimal model performance. In response, we introduce Multi-Scale Fusion module aimed at enhancing the original skip connections and addressing the semantic gap. Our approach fully incorporates the correlation between outputs from adjacent encoder layers and facilitates bidirectional information exchange across multiple layers. Additionally, we introduce Channel Relation Perception module to guide the fused multi-scale information for efficient connection with decoder features. These two modules collectively bridge the semantic gap by capturing spatial and channel dependencies in the features, contributing to accurate medical image segmentation. Building upon these innovations, we propose a novel network called MFH-Net. On three publicly available datasets, ISIC2016, ISIC2017, and Kvasir-SEG, we perform a comprehensive evaluation of the network. The experimental results show that MFH-Net exhibits higher segmentation accuracy in comparison with other competing methods. Importantly, the modules we have devised can be seamlessly incorporated into various networks, such as U-Net and its variants, offering a potential avenue for further improving model performance.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 6","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142428873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RetNet30: A Novel Stacked Convolution Neural Network Model for Automated Retinal Disease Diagnosis","authors":"Krishnakumar Subramaniam, Archana Naganathan","doi":"10.1002/ima.23187","DOIUrl":"https://doi.org/10.1002/ima.23187","url":null,"abstract":"<div>\u0000 \u0000 <p>Automated diagnosis of retinal diseases holds significant promise in enhancing healthcare efficiency and patient outcomes. However, existing methods often lack the accuracy and efficiency required for timely disease detection. To address this gap, we introduce RetNet30, a novel stacked convolutional neural network (CNN) designed to revolutionize automated retinal disease diagnosis. RetNet30 combines a custom-built 30-layer CNN with a fine-tuned Inception V3 model, integrating these sub-models through logistic regression to achieve superior classification performance. Extensive evaluations on retinal image datasets such as DRIVE, STARE, CHASE_DB1, and HRF demonstrate significant improvements in accuracy, sensitivity, specificity, and area under the ROC curve (AUROC) when compared to conventional approaches. By leveraging advanced deep learning architectures, RetNet30 not only enhances diagnostic precision but also generalizes effectively across diverse datasets, establishing a new benchmark in retinal disease classification. This novel approach offers a highly efficient and reliable solution for early disease detection and patient management, addressing the limitations of manual examination methods. Through rigorous quantitative and qualitative assessments, our proposed method demonstrates its potential to significantly impact medical image analysis and improve healthcare outcomes. RetNet30 marks a major step forward in automated retinal disease diagnosis, showcasing the future of AI-driven advancements in ophthalmology.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 5","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142316971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Layer Connection SegFormer Attention U-Net for Efficient TRUS Image Segmentation","authors":"Yongtao Shi, Wei Du, Chao Gao, Xinzhi Li","doi":"10.1002/ima.23178","DOIUrl":"https://doi.org/10.1002/ima.23178","url":null,"abstract":"<div>\u0000 \u0000 <p>Accurately and rapidly segmenting the prostate in transrectal ultrasound (TRUS) images remains challenging due to the complex semantic information in ultrasound images. The paper discusses a cross-layer connection with SegFormer attention U-Net for efficient TRUS image segmentation. The SegFormer framework is enhanced by reducing model parameters and complexity without sacrificing accuracy. We introduce layer-skipping connections for precise positioning and combine local context with global dependency for superior feature recognition. The decoder is improved with Multi-layer Perceptual Convolutional Block Attention Module (MCBAM) for better upsampling and reduced information loss, leading to increased accuracy. The experimental results show that compared with classic or popular deep learning methods, this method has better segmentation performance, with the dice similarity coefficient (DSC) of 97.55% and the intersection over union (IoU) of 95.23%. This approach balances encoder efficiency, multi-layer information flow, and parameter reduction.</p>\u0000 </div>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"34 5","pages":""},"PeriodicalIF":3.0,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142316942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}