Three-Blind Validation Strategy of Deep Learning Models for Image Segmentation
Andrés Larroza, Francisco Javier Pérez-Benito, Raquel Tendero, Juan Carlos Perez-Cortes, Marta Román, Rafael Llobet
Journal of Imaging 11(5), 2025-05-21. DOI: 10.3390/jimaging11050170. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113085/pdf/

Abstract: Image segmentation plays a central role in computer vision applications such as medical imaging, industrial inspection, and environmental monitoring. However, evaluating segmentation performance can be particularly challenging when ground truth is not clearly defined, as is often the case in tasks involving subjective interpretation. These challenges are amplified by inter- and intra-observer variability, which complicates the use of human annotations as a reliable reference. To address this, we propose a novel validation framework, referred to as the three-blind validation strategy, that enables rigorous assessment of segmentation models in contexts where subjectivity and label variability are significant. The core idea is to have a third independent expert, blind to the labeler identities, assess a shuffled set of segmentations produced by multiple human annotators and/or automated models. This allows for the unbiased evaluation of model performance and helps uncover patterns of disagreement that may indicate systematic issues with either human or machine annotations. The primary objective of this study is to introduce and demonstrate this validation strategy as a generalizable framework for robust model evaluation in subjective segmentation tasks. We illustrate its practical implementation in a mammography use case involving dense tissue segmentation while emphasizing its potential applicability to a broad range of segmentation scenarios.
IEWNet: Multi-Scale Robust Watermarking Network Against Infrared Image Enhancement Attacks
Yu Bai, Li Li, Shanqing Zhang, Jianfeng Lu, Ting Luo
Journal of Imaging 11(5), 2025-05-21. DOI: 10.3390/jimaging11050171. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113104/pdf/

Abstract: Infrared (IR) images record the temperature radiation distribution of the captured object; hue and color differences reflect differences in heat and temperature, respectively. Due to the thermal diffusion effect, however, targets in IR images can appear enlarged and object boundaries blurred, so IR images often undergo enhancement operations before use in downstream applications. In most cases, infrared enhancement (IRE) algorithms degrade watermark information embedded in the IR image. In this paper, we propose IEWNet, a novel multi-scale watermarking model robust to IRE attacks. The model trains a preprocessing module for extracting image features based on the conventional Undecimated Dual-Tree Complex Wavelet Transform (UDTCWT). We further develop a noise layer covering four deep learning and eight classical attacks, all based on IRE algorithms, and insert either a noise layer or an enhancement module between the encoder and decoder depending on the application scenario. Imperceptibility experiments on six public datasets show that the Peak Signal-to-Noise Ratio (PSNR) is usually above 40 dB, and robustness exceeds that of the state-of-the-art image watermarking algorithms included in the performance comparison.
{"title":"Recovery and Characterization of Tissue Properties from Magnetic Resonance Fingerprinting with Exchange.","authors":"Naren Nallapareddy, Soumya Ray","doi":"10.3390/jimaging11050169","DOIUrl":"10.3390/jimaging11050169","url":null,"abstract":"<p><p>Magnetic resonance fingerprinting (MRF), a quantitative MRI technique, enables the acquisition of multiple tissue properties in a single scan. In this paper, we study a proposed extension of MRF, MRF with exchange (MRF-X), which can enable acquisition of the six tissue properties T1a,T2a, T1b, T2b, ρ and τ simultaneously. In MRF-X, 'a' and 'b' refer to distinct compartments modeled in each voxel, while ρ is the fractional volume of component 'a', and τ is the exchange rate of protons between the two components. To assess the feasibility of recovering these properties, we first empirically characterize a similarity metric between MRF and MRF-X reconstructed tissue property values and known reference property values for candidate signals. Our characterization indicates that such a recovery is possible, although the similarity metric surface across the candidate tissue properties is less structured for MRF-X than for MRF. We then investigate the application of different optimization techniques to recover tissue properties from noisy MRF and MRF-X data. Previous work has widely utilized template dictionary-based approaches in the context of MRF; however, such approaches are infeasible with MRF-X. Our results show that Simplicial Homology Global Optimization (SHGO), a global optimization algorithm, and Limited-memory Bryoden-Fletcher-Goldfarb-Shanno algorithm with Bounds (L-BFGS-B), a local optimization algorithm, performed comparably with direct matching in two-tissue property MRF at an SNR of 5. These optimization methods also successfully recovered five tissue properties from MRF-X data. However, with the current pulse sequence and reconstruction approach, recovering all six tissue properties remains challenging for all the methods investigated.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 5","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12113007/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Segmentation of Non-Small Cell Lung Carcinomas: Introducing DRU-Net and Multi-Lens Distortion
Soroush Oskouei, Marit Valla, André Pedersen, Erik Smistad, Vibeke Grotnes Dale, Maren Høibø, Sissel Gyrid Freim Wahl, Mats Dehli Haugum, Thomas Langø, Maria Paula Ramnefjell, Lars Andreas Akslen, Gabriel Kiss, Hanne Sorger
Journal of Imaging 11(5), 2025-05-20. DOI: 10.3390/jimaging11050166. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112506/pdf/

Abstract: The increased workload in pathology laboratories means that automated tools such as artificial intelligence models can be useful in helping pathologists with their tasks. In this paper, we propose a segmentation model (DRU-Net) that delineates human non-small cell lung carcinomas, together with an augmentation method that improves classification results. The proposed model is a fused combination of truncated pre-trained DenseNet201 and ResNet101V2 as a patch-wise classifier, followed by a lightweight U-Net as a refinement model. Two datasets (the Norwegian Lung Cancer Biobank and the Haukeland University Lung Cancer cohort) were used to develop the model. The DRU-Net model achieved an average Dice similarity coefficient of 0.91. The proposed spatial augmentation method (multi-lens distortion) improved the Dice similarity coefficient from 0.88 to 0.91. Our findings show that selecting image patches that specifically include regions of interest leads to better results for the patch-wise classifier than other sampling methods. A qualitative analysis by pathology experts showed that the DRU-Net model was generally successful in tumor detection. Results on the test set showed some areas of false-positive and false-negative segmentation in the periphery, particularly in tumors with inflammatory and reactive changes. In summary, the presented DRU-Net model demonstrated the best performance on the segmentation task, and the proposed augmentation technique improved the results.
{"title":"The Creation of Artificial Data for Training a Neural Network Using the Example of a Conveyor Production Line for Flooring.","authors":"Alexey Zaripov, Roman Kulshin, Anatoly Sidorov","doi":"10.3390/jimaging11050168","DOIUrl":"10.3390/jimaging11050168","url":null,"abstract":"<p><p>This work is dedicated to the development of a system for generating artificial data for training neural networks used within a conveyor-based technology framework. It presents an overview of the application areas of computer vision (CV) and establishes that traditional methods of data collection and annotation-such as video recording and manual image labeling-are associated with high time and financial costs, which limits their efficiency. In this context, synthetic data represents an alternative capable of significantly reducing the time and financial expenses involved in forming training datasets. Modern methods for generating synthetic images using various tools-from game engines to generative neural networks-are reviewed. As a tool-platform solution, the concept of digital twins for simulating technological processes was considered, within which synthetic data is utilized. Based on the review findings, a generalized model for synthetic data generation was proposed and tested on the example of quality control for floor coverings on a conveyor line. The developed system provided the generation of photorealistic and diverse images suitable for training neural network models. A comparative analysis showed that the YOLOv8 model trained on synthetic data significantly outperformed the model trained on real images: the mAP50 metric reached 0.95 versus 0.36, respectively. This result demonstrates the high adequacy of the model built on the synthetic dataset and highlights the potential of using synthetic data to improve the quality of computer vision models when access to real data is limited.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 5","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112862/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"ShapeNet": A Shape Regression Convolutional Neural Network Ensemble Applied to the Segmentation of the Left Ventricle in Echocardiography
Eduardo Galicia Gómez, Fabián Torres-Robles, Jorge Perez-Gonzalez, Fernando Arámbula Cosío
Journal of Imaging 11(5), 2025-05-20. DOI: 10.3390/jimaging11050165. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112286/pdf/

Abstract: Left ventricle (LV) segmentation is crucial for cardiac diagnosis but remains challenging in echocardiography. We present ShapeNet, a fully automatic method combining a convolutional neural network (CNN) ensemble with an improved active shape model (ASM). ShapeNet predicts optimal pose (rotation, translation, and scale) and shape parameters, which are refined using the improved ASM. The ASM optimizes an objective function constructed from gray-level profiles concatenated into a single contour appearance vector. The model was trained on 4800 augmented CAMUS images and tested on both the CAMUS and EchoNet databases. It achieved a Dice coefficient of 0.87 and a Hausdorff distance (HD) of 4.08 pixels on CAMUS, and a Dice coefficient of 0.81 with an HD of 10.21 pixels on EchoNet, demonstrating robust performance across datasets. These results highlight the improved HD accuracy compared to previous semantic and shape-based segmentation methods, achieved by generating statistically valid LV contours from ultrasound images.
LIM: Lightweight Image Local Feature Matching
Shanquan Ying, Jianfeng Zhao, Guannan Li, Junjie Dai
Journal of Imaging 11(5), 2025-05-20. DOI: 10.3390/jimaging11050164. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112725/pdf/

Abstract: Image matching is a fundamental problem in computer vision, serving as a core component in tasks such as visual localization, structure from motion, and SLAM. While recent advances using convolutional neural networks and transformers have achieved impressive accuracy, their substantial computational demands hinder practical deployment on resource-constrained devices such as mobile and embedded platforms. To address this challenge, we propose LIM, a lightweight image local feature matching network designed for computationally constrained embedded systems. LIM integrates efficient feature extraction and matching modules that significantly reduce model complexity while maintaining competitive performance. Our design emphasizes robustness to extreme viewpoint and rotational variations, making it suitable for real-world deployment scenarios. Extensive experiments on multiple benchmarks demonstrate that LIM achieves a favorable trade-off between speed and accuracy, running more than 3× faster than existing deep matching methods while preserving high-quality matching results. These characteristics make LIM an effective solution for real-time applications in power-limited environments.
{"title":"Comparing Geodesic Filtering to State-of-the-Art Algorithms: A Comprehensive Study and CUDA Implementation.","authors":"Pierre Boulanger, Sadid Bin Hasan","doi":"10.3390/jimaging11050167","DOIUrl":"10.3390/jimaging11050167","url":null,"abstract":"<p><p>This paper presents a comprehensive investigation into advanced image processing using geodesic filtering within a Riemannian manifold framework. We introduce a novel geodesic filtering formulation that uniquely integrates spatial and intensity relationships through minimal path computation, demonstrating significant improvements in edge preservation and noise reduction compared to conventional methods. Our quantitative analysis using peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) metrics across diverse image types reveals that our approach outperforms traditional techniques in preserving fine details while effectively suppressing both Gaussian and non-Gaussian noise. We developed an automatic parameter optimization methodology that eliminates manual tuning by identifying optimal filtering parameters based on image characteristics. Additionally, we present a highly optimized GPU implementation featuring innovative wave-propagation algorithms and memory access optimization techniques that achieve a 200× speedup, making geodesic filtering practical for real-time applications. Our work bridges the gap between theoretical elegance and computational practicality, establishing geodesic filtering as a superior solution for challenging image processing tasks in fields ranging from medical imaging to remote sensing.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 5","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112283/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Lightweight Semantic Segmentation Model for Underwater Images Based on DeepLabv3.","authors":"Chongjing Xiao, Zhiyu Zhou, Yanjun Hu","doi":"10.3390/jimaging11050162","DOIUrl":"10.3390/jimaging11050162","url":null,"abstract":"<p><p>Underwater object image processing is a crucial technology for marine environmental exploration. The complexity of marine environments typically results in underwater object images exhibiting color deviation, imbalanced contrast, and blurring. Existing semantic segmentation methods for underwater objects either suffer from low segmentation accuracy or fail to meet the lightweight requirements of underwater hardware. To address these challenges, this study proposes a lightweight semantic segmentation model based on DeepLabv3+. The framework employs MobileOne-S0 as the lightweight backbone for feature extraction, integrates Simple, Parameter-Free Attention Module (SimAM) into deep feature layers, replaces global average pooling in the Atrous Spatial Pyramid Pooling (ASPP) module with strip pooling, and adopts a content-guided attention (CGA)-based mixup fusion scheme to effectively combine high-level and low-level features while minimizing parameter redundancy. Experimental results demonstrate that the proposed model achieves a mean Intersection over Union (mIoU) of 71.18% on the DUT-USEG dataset, with parameters and computational complexity reduced to 6.628 M and 39.612 G FLOPs, respectively. These advancements significantly enhance segmentation accuracy while maintaining model efficiency, making the model highly suitable for resource-constrained underwater applications.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 5","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Face Image Super-Resolution Model Based on Generative Adversarial Network.","authors":"Qingyu Liu, Yeguo Sun, Lei Chen, Lei Liu","doi":"10.3390/jimaging11050163","DOIUrl":"10.3390/jimaging11050163","url":null,"abstract":"<p><p>Image super-resolution (SR) models based on the generative adversarial network (GAN) face challenges such as unnatural facial detail restoration and local blurring. This paper proposes an improved GAN-based model to address these issues. First, a Multi-scale Hybrid Attention Residual Block (MHARB) is designed, which dynamically enhances feature representation in critical face regions through dual-branch convolution and channel-spatial attention. Second, an Edge-guided Enhancement Block (EEB) is introduced, generating adaptive detail residuals by combining edge masks and channel attention to accurately recover high-frequency textures. Furthermore, a multi-scale discriminator with a weighted sub-discriminator loss is developed to balance global structural and local detail generation quality. Additionally, a phase-wise training strategy with dynamic adjustment of learning rate (Lr) and loss function weights is implemented to improve the realism of super-resolved face images. Experiments on the CelebA-HQ dataset demonstrate that the proposed model achieves a PSNR of 23.35 dB, a SSIM of 0.7424, and a LPIPS of 24.86, outperforming classical models and delivering superior visual quality in high-frequency regions. Notably, this model also surpasses the SwinIR model (PSNR: 23.28 dB → 23.35 dB, SSIM: 0.7340 → 0.7424, and LPIPS: 30.48 → 24.86), validating the effectiveness of the improved model and the training strategy in preserving facial details.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 5","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12112315/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}