Journal of Imaging: Latest Articles

A Method for Paired Comparisons of Glo Germ Quantity in Images of Hands Before and After Washing.
IF 2.7
Journal of Imaging Pub Date: 2026-04-21 DOI: 10.3390/jimaging12040178
Jordan Ali Rashid, Stuart Criley

We present a reproducible pipeline that converts color images into quantitative fluorescence maps by combining spectral measurement with a linear mixture model. The method is designed specifically for quantitative comparisons of Glo Germ™ on images of hands taken under different experimental conditions with controlled illumination. The emission spectrum of Glo Germ is measured with a spectrophotometer and normalized to obtain its spectral power density function. This spectrum is projected into CIE XYZ coordinates and incorporated into a linear mixture model in which each pixel contains contributions from white light, UV-illuminated skin reflectance, and fluorophore emission. Component magnitudes are estimated with non-negative least squares, yielding a grayscale image whose intensity is a monotonic proxy for local fluorophore density. Spatial integration provides an image-level summary proportional to the total detected material. Compared with single-channel proxies, the proposed observer suppresses background structure, improves contrast, and remains radiometrically interpretable. Because the method depends only on measurable spectra and linear transforms, it can be reproduced across cameras and extended to other fluorophores.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117818/pdf/
Citations: 0
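To make the per-pixel unmixing step concrete, below is a minimal sketch of the linear mixture model solved with non-negative least squares. The three-column basis here is a hypothetical placeholder; the paper derives its basis from spectrophotometer measurements projected into CIE XYZ.

```python
# Hedged sketch of the per-pixel linear mixture model: each pixel's color is
# modeled as a non-negative combination of three known component colors
# (white light, UV-lit skin, fluorophore emission), solved with NNLS.
import numpy as np
from scipy.optimize import nnls

# Columns: XYZ projections of white light, UV-lit skin, and Glo Germ emission.
# These values are illustrative placeholders, not measured spectra.
BASIS = np.array([
    [0.95, 0.30, 0.12],
    [1.00, 0.25, 0.35],
    [1.09, 0.20, 0.05],
])  # shape: (3 channels, 3 components)

def fluorophore_map(image_xyz: np.ndarray) -> np.ndarray:
    """Return a grayscale map of the estimated fluorophore magnitude per pixel.

    image_xyz: float array of shape (H, W, 3) holding CIE XYZ pixel values.
    """
    h, w, _ = image_xyz.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            coeffs, _residual = nnls(BASIS, image_xyz[i, j])
            out[i, j] = coeffs[2]  # third component = fluorophore emission
    return out

# Image-level summary proportional to total detected material:
# total = fluorophore_map(img).sum()
```
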
Automated Morphological Profiling via Deep Learning-Based Segmentation for High-Throughput Phenotypic Screening.
IF 2.7
Journal of Imaging Pub Date: 2026-04-21 DOI: 10.3390/jimaging12040179
Bendegúz H Zováthi, Philipp Kainz

Reproducible morphological profiling, particularly for drug discovery, has become an important tool for compound evaluation. Established workflows such as CellProfiler provide a widely adopted foundation for Cell Painting analysis. However, conventional pipelines often require substantial manual configuration and technical expertise, which can limit scalability and accessibility. In this study, a fully automated deep learning-based workflow is presented for segmentation-driven morphological profiling from raw microscopy data. Using a curated subset of the JUMP Cell Painting pilot dataset, ground-truth masks were generated and used to train a U-Net-based segmentation model in the IKOSA platform. Post-processing strategies were introduced to improve instance separation and reduce segmentation artifacts. The final model achieved strong segmentation performance (precision/recall/AP up to 0.98/0.94/0.92 for nuclei), with an average runtime of 2.2 s per 1080 × 1080 image. Segmentation outputs enabled large-scale feature extraction, yielding 3664 morphological descriptors that showed high correlation with CellProfiler-derived measurements (normalized MAE: 0.0298). Feature prioritization further reduced redundancy to 1145 informative descriptors. These results demonstrate that automated deep learning pipelines can complement established Cell Painting workflows by reducing configuration overhead while maintaining compatibility with validated morphological profiling standards. The proposed workflow may help improve resource efficiency in drug discovery and personalized medicine.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117058/pdf/
Citations: 0
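The abstract reports a normalized MAE of 0.0298 against CellProfiler features without spelling out the normalization; the sketch below shows one plausible definition, assuming joint min-max scaling per matched feature column.

```python
# Hedged sketch: one plausible normalized mean absolute error between two
# morphological feature tables (e.g., deep-learning-derived vs.
# CellProfiler-derived descriptors). The min-max normalization is an
# assumption, not the paper's stated definition.
import numpy as np

def normalized_mae(a: np.ndarray, b: np.ndarray) -> float:
    """MAE after jointly min-max scaling each feature column.

    a, b: arrays of shape (n_objects, n_features) with matched columns.
    """
    lo = np.minimum(a.min(axis=0), b.min(axis=0))
    hi = np.maximum(a.max(axis=0), b.max(axis=0))
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
    return float(np.mean(np.abs((a - lo) / span - (b - lo) / span)))

# Example with random stand-in features:
# rng = np.random.default_rng(0)
# f1 = rng.normal(size=(500, 3664))
# f2 = f1 + rng.normal(scale=0.01, size=f1.shape)
# print(normalized_mae(f1, f2))
```
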
Video-Based Arabic Sign Language Recognition with Mediapipe and Deep Learning Techniques.
IF 2.7
Journal of Imaging Pub Date: 2026-04-20 DOI: 10.3390/jimaging12040177
Dana El-Rushaidat, Nour Almohammad, Raine Yeh, Kinda Fayyad

This paper addresses the critical communication barrier experienced by deaf and hearing-impaired individuals in the Arab world through the development of an affordable, video-based Arabic Sign Language (ArSL) recognition system. Designed for broad accessibility, the system eliminates specialized hardware by leveraging standard mobile or laptop cameras. Our methodology employs Mediapipe for real-time extraction of hand, face, and pose landmarks from video streams. These anatomical features are then processed by a hybrid deep learning model integrating Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), specifically Bidirectional Long Short-Term Memory (BiLSTM) layers. The CNN component captures spatial features, such as intricate hand shapes and body movements, within individual frames. Concurrently, BiLSTMs model long-term temporal dependencies and motion trajectories across consecutive frames. This integrated CNN-BiLSTM architecture is critical for generating a comprehensive spatiotemporal representation, enabling accurate differentiation of complex signs where meaning relies on both static gestures and dynamic transitions, thus preventing misclassification that CNN-only or RNN-only models would incur. Rigorously evaluated on the author-created JUST-SL dataset and the publicly available KArSL dataset, the system achieved 96% overall accuracy for JUST-SL and 99% for KArSL. These results demonstrate the system's superior accuracy compared to previous research, particularly for recognizing full Arabic words, thereby significantly enhancing communication accessibility for the deaf and hearing-impaired community.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117685/pdf/
Citations: 0
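A minimal sketch of the described CNN-BiLSTM over per-frame landmark tensors follows; layer widths and the 543-landmark Mediapipe Holistic layout (33 pose + 468 face + 21 per hand) are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a CNN-BiLSTM over per-frame Mediapipe landmarks, in the
# spirit of the described architecture. Layer sizes are illustrative guesses.
import torch
import torch.nn as nn

class CnnBiLstmSigns(nn.Module):
    def __init__(self, n_landmarks=543, n_coords=3, n_classes=100):
        super().__init__()
        # Per-frame spatial encoder: 1D convolutions over the landmark axis.
        self.frame_cnn = nn.Sequential(
            nn.Conv1d(n_coords, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Temporal model over the frame sequence.
        self.bilstm = nn.LSTM(64, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 128, n_classes)

    def forward(self, x):
        # x: (batch, time, landmarks, coords)
        b, t, l, c = x.shape
        feats = self.frame_cnn(x.reshape(b * t, l, c).transpose(1, 2))
        feats = feats.squeeze(-1).reshape(b, t, 64)
        out, _ = self.bilstm(feats)
        return self.head(out[:, -1])  # classify from the last time step

# logits = CnnBiLstmSigns()(torch.randn(2, 30, 543, 3))  # -> shape (2, 100)
```
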
MSWA-ResNet: Multi-Scale Wavelet Attention for Patient-Level and Interpretable Breast Cancer Histopathology Classification.
IF 2.7
Journal of Imaging Pub Date: 2026-04-19 DOI: 10.3390/jimaging12040176
Ghadeer Al Sukkar, Ali Rodan, Azzam Sleit

Breast cancer histopathological classification is critical for diagnosis and treatment planning, yet manual assessment remains time-consuming and subject to inter-observer variability. Although deep learning approaches have advanced automated analysis, image-level data splitting may introduce data leakage, and spatial-domain architectures lack explicit multi-scale frequency modeling. This study proposes MSWA-ResNet, a Multi-Scale Wavelet Attention Residual Network that embeds recursive discrete wavelet decomposition within residual blocks to enable frequency-aware and scale-aware feature learning. The model is evaluated on the BreakHis dataset using a strict patient-level protocol with 70/30 patient-wise splitting, five-fold stratified cross-validation, ensemble prediction, and hierarchical aggregation from patch to patient level. MSWA-ResNet achieves 96% patient-level accuracy at 100×, 200×, and 400× magnifications, and 92% at 40×, with F1-scores of 0.97 and 0.94, respectively. At 200× and 400×, accuracy improves from 0.92 to 0.96 and F1-score from 0.94 to 0.97 over baseline CNNs while maintaining 11.8-12.1 M parameters and 2.5-4.8 ms inference time. Grad-CAM demonstrates improved localization of diagnostically relevant regions, indicating that explicit multi-scale frequency modeling enhances accurate and interpretable patient-level classification.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117778/pdf/
Citations: 0
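As one possible reading of "recursive discrete wavelet decomposition within residual blocks", the sketch below embeds a single-level Haar decomposition in a residual block and uses subband energies as a channel-attention gate; the paper's actual block design may differ.

```python
# Hedged sketch of a residual block with an embedded single-level 2D Haar
# wavelet decomposition. This is a minimal interpretation of wavelet
# attention, not the paper's exact MSWA block.
import torch
import torch.nn as nn
import torch.nn.functional as F

def haar_dwt(x):
    """Single-level 2D Haar DWT. x: (B, C, H, W) with even H and W.
    Returns (LL, LH, HL, HH), each of shape (B, C, H/2, W/2)."""
    a = x[:, :, 0::2, 0::2]; b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]; d = x[:, :, 1::2, 1::2]
    return ((a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2)

class WaveletResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
        # Channel attention derived from wavelet subband energies.
        self.attn = nn.Sequential(nn.Linear(4 * ch, ch), nn.Sigmoid())

    def forward(self, x):
        ll, lh, hl, hh = haar_dwt(x)
        energy = torch.cat([s.pow(2).mean(dim=(2, 3)) for s in (ll, lh, hl, hh)], dim=1)
        gate = self.attn(energy).unsqueeze(-1).unsqueeze(-1)  # (B, ch, 1, 1)
        return F.relu(x + gate * self.conv(x))

# y = WaveletResBlock(64)(torch.randn(1, 64, 56, 56))
```
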
SurveyNet: A Unified Deep Learning Framework for OCR and OMR-Based Survey Digitization.
IF 2.7
Journal of Imaging Pub Date: 2026-04-17 DOI: 10.3390/jimaging12040175
Rubi Quiñones, Sreeja Cheekireddy, Eren Gultepe

Manual survey data entry remains a bottleneck in large-scale research, marketing, and public policy, where survey sheets are still widely used due to accessibility and high response rates. Despite the progress in Optical Character Recognition (OCR) and Optical Mark Recognition (OMR), existing systems treat these tasks separately and are typically tailored to clean, standardized forms, making them unreliable for real-world survey sheets with diverse markings and handwritten inputs. These limitations hinder automation and introduce significant error rates in data transcription. To address this, we propose SurveyNet, a unified deep learning framework that combines OCR and OMR capabilities to automatically digitize complex survey responses within a single model. SurveyNet processes both handwritten digits and a wide variety of mark types, including ticks, circles, and crosses, across multiple question formats. We also introduce SurveySet, a novel dataset comprising 135 real-world survey forms annotated across four key response types. Experimental results demonstrate that SurveyNet achieves between 50% and 97% classification accuracy across tasks, with strong performance even on small and imbalanced datasets. This framework offers a scalable solution for streamlining survey digitization workflows, reducing manual errors, and enabling timely analysis in domains ranging from consumer research to public health and education.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117292/pdf/
Citations: 0
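For intuition about the OMR signal a model like SurveyNet learns from, here is a rule-based sketch that flags a checkbox as marked from its ink fill ratio after Otsu binarization; the threshold value and the assumption of pre-cropped cells are illustrative, and the paper itself uses a learned model rather than this rule.

```python
# Hedged sketch of a simple OMR primitive: decide whether a checkbox region is
# marked by measuring its ink fill ratio after binarization.
import cv2
import numpy as np

def is_marked(gray_roi: np.ndarray, fill_threshold: float = 0.12) -> bool:
    """gray_roi: 8-bit grayscale crop of one checkbox cell."""
    # Otsu picks the threshold; INV makes ink (dark pixels) nonzero.
    _, binary = cv2.threshold(gray_roi, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    fill = float(np.count_nonzero(binary)) / binary.size
    return fill > fill_threshold  # ticks, circles, and crosses all raise this

# roi = cv2.imread("checkbox.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
# print(is_marked(roi))
```
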
Cracking the Code: Computational Image Analysis Tools for Histopathological and Morphometric Insights.
IF 2.7
Journal of Imaging Pub Date: 2026-04-17 DOI: 10.3390/jimaging12040173
Ana Luisa Teixeira de Almeida, Ana Beatriz Gram Dos Santos, Debora Ferreira Barreto-Vieira

The assessment of histopathological features has evolved considerably, transitioning from traditional manual measurements to more sophisticated, technology-assisted approaches. Classical histological evaluation, while foundational and highly reliable, is inherently labor-intensive and subject to inter-observer variability. With the advent of digital pathology, these practices have been progressively enhanced by image processing software, which offers capabilities such as segmentation, feature extraction, and data visualization. However, despite their promise, machine learning approaches in this branch of pathology face notable hurdles, such as the need for large, high-quality annotated datasets and integration into existing workflows. Looking forward, the role of specialists in histological evaluation remains crucial in this evolving landscape. While automation streamlines routine tasks, the expertise of pathologists is indispensable in validating results and interpreting findings in scientific contexts. This comprehensive review explores the trajectory of histological evaluation methods, from manual and classical strategies to cutting-edge digital tools, highlighting the benefits, limitations, and implications of each approach in contemporary practice.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13118002/pdf/
Citations: 0
Vision-Based Measurement of Breathing Deformation in Wind Turbine Blade Fatigue Test.
IF 2.7
Journal of Imaging Pub Date: 2026-04-17 DOI: 10.3390/jimaging12040174
Xianlong Wei, Cailin Li, Zhiyong Wang, Zhao Hai, Jinghua Wang, Leian Zhang

Wind turbine blades are subjected to complex environmental conditions during long-term operation, which may lead to structural degradation and performance loss. To ensure structural integrity, fatigue testing prior to deployment is essential. This paper proposes a vision-based method for measuring the full-cycle breathing deformation of wind turbine blades during fatigue testing. The method captures dynamic image sequences of the blade's hotspot cross-section using industrial cameras and employs a feature-based template matching approach to reconstruct the three-dimensional coordinates of target points. Through coordinate transformation, the deformation trajectories are obtained, enabling quantitative analysis of the blade's dynamic responses in both flapwise and edgewise directions. A dedicated hardware-software system was developed and validated through full-scale fatigue experiments. Quantitative comparison with strain gage measurements shows that the proposed method achieves mean absolute deviations of 0.84 mm and 0.93 mm in two independent experiments, respectively, with closely matched deformation trends under typical loading conditions. These results demonstrate that the proposed method can reliably capture the global deformation behavior of the blade with millimeter-level accuracy, while significantly reducing instrumentation complexity compared to conventional contact-based approaches. The proposed method provides an effective and practical solution for full-field dynamic deformation measurement in blade fatigue testing, offering strong potential for structural health monitoring and early damage detection in wind turbine systems.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117450/pdf/
Citations: 0
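A minimal single-camera sketch of the template matching step is given below, tracking one target across the frame sequence with normalized cross-correlation; the paper's multi-camera 3D reconstruction and coordinate transformation are omitted.

```python
# Hedged sketch of the feature-based template matching step: track a marked
# target across a fatigue-test image sequence with normalized
# cross-correlation. Single-camera 2D tracking only.
import cv2
import numpy as np

def track_template(frames, template):
    """frames: iterable of grayscale images; template: grayscale patch.
    Returns an (N, 2) array of (x, y) top-left match positions per frame."""
    positions = []
    for frame in frames:
        scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, max_loc = cv2.minMaxLoc(scores)  # best-match location
        positions.append(max_loc)
    return np.array(positions)

# trajectory = track_template(frames, template)
# Flapwise/edgewise deflections then follow from projecting the trajectory
# into the blade coordinate frame (the coordinate transformation step).
```
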
Dual RANSAC with Rescue Midpoint Multi-Trend Vanishing Point Detection.
IF 2.7
Journal of Imaging Pub Date: 2026-04-16 DOI: 10.3390/jimaging12040172
Nada Said, Bilal Nakhal, Ali El-Zaart, Lama Affara

Vanishing point detection is a fundamental step in computer vision that enables 3D scene understanding and autonomous navigation. Classical techniques struggle with heavily cluttered scenes and with images containing multiple perspective cues, leading to poor or unreliable vanishing point estimates. We present a Dual RANSAC with Rescue Midpoint-based Multi-Trend Vanishing Point Detection framework, which targets the simultaneous detection and fine-tuning of multiple, globally consistent vanishing points. The proposed framework introduces a novel Midpoint-based Multi-Trend Random Sample Consensus formulation that operates on line segment midpoints to infer dominant directional groups, thereby eliminating noisy or unstable midpoints and stabilizing subsequent vanishing point inference. The main novelty lies in using line segment midpoints to model the orientation variation as a linear regression in the midpoint-orientation space, which helps reduce sensitivity to endpoint instability. Candidate vanishing points are prioritized through inlier-based confidence ranking and subsequently optimized via an MSAC-based arbiter to resolve hypothesis conflicts and minimize geometric error. We evaluate our work against state-of-the-art techniques such as J-Linkage and Conditional Sample Consensus on two challenging public datasets, the York Urban Dataset and the Toulouse Vanishing Point Dataset. The results show that the proposed framework achieves a recall of up to 95% and an image success rate of almost 84%, outperforming both J-Linkage and Conditional Sample Consensus, especially under tighter angular thresholds. This demonstrates the ability of the proposed framework to provide enhanced stability and localization accuracy.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13118200/pdf/
Citations: 0
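The core midpoint-orientation idea can be sketched as a single-trend RANSAC that regresses segment orientation on midpoint position; the multi-trend bookkeeping, rescue step, and MSAC arbiter of the full method are omitted, and angle wrap-around is ignored for brevity.

```python
# Hedged sketch: fit a linear trend of segment orientation as a function of
# midpoint position, keeping segments whose orientation matches the trend.
# A minimal single-trend variant of the paper's multi-trend formulation.
import numpy as np

def ransac_orientation_trend(midpoints, angles, iters=500,
                             tol=np.deg2rad(2.0), seed=0):
    """midpoints: (N, 2) segment midpoints; angles: (N,) orientations (rad).
    Returns a boolean inlier mask for the dominant trend."""
    rng = np.random.default_rng(seed)
    # Linear model: angle ~ a*x + b*y + c in midpoint-orientation space.
    X = np.column_stack([midpoints, np.ones(len(midpoints))])
    best_mask = np.zeros(len(angles), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(angles), size=3, replace=False)
        coef, *_ = np.linalg.lstsq(X[idx], angles[idx], rcond=None)
        mask = np.abs(X @ coef - angles) < tol  # angle wrap-around ignored
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Inlier segments can then vote for a vanishing point (e.g., a least-squares
# intersection of their supporting lines), akin to the MSAC arbiter stage.
```
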
ARS-GS: Anisotropic Reflective Spherical 3D Gaussian Splatting.
IF 2.7
Journal of Imaging Pub Date: 2026-04-15 DOI: 10.3390/jimaging12040170
Chenrui Wu, Xinyu Shi, Zhenzhong Chu, Yao Huang

3D scene reconstruction is a fundamental technology with widespread applications in virtual reality, structural inspection, and robotic systems. While recent advances in 3D Gaussian Splatting have significantly enhanced scene reconstruction capabilities, the performance of such methods remains suboptimal when applied to highly reflective environments. To overcome this limitation, we introduce ARS-GS, a novel framework that integrates Anisotropic Spherical Gaussian reflection modeling and spherical harmonics diffuse approximation into a physically based rendering pipeline. This architecture incorporates a skip connection between the Anisotropic Spherical Gaussian module and the Gaussian primitives, effectively preserving surface details while maintaining computational efficiency. Comprehensive experimental evaluations validate the efficacy of ARS-GS across multiple datasets. Specifically, our method establishes new state-of-the-art quantitative benchmarks, achieving a peak signal-to-noise ratio of 38.30 and a structural similarity index measure of 0.997 on the neural radiance fields synthetic dataset, alongside a peak signal-to-noise ratio of 46.31 on the Gloss Blender dataset. Furthermore, on the challenging reflective neural radiance fields real-world dataset, our approach secures the highest peak signal-to-noise ratio scores, highlighted by a metric of 26.26 on the Sedan scene. The proposed method also substantially reduces perceptual errors, yielding a learned perceptual image patch similarity as low as 0.204, thereby consistently outperforming existing techniques in the reconstruction of highly specular surfaces with superior geometric fidelity.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117809/pdf/
Citations: 0
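The reflection primitive named in the abstract, the Anisotropic Spherical Gaussian, is commonly written as c * max(v·z, 0) * exp(-lam*(v·x)^2 - mu*(v·y)^2) over an orthonormal lobe frame (x, y, z). The sketch below evaluates that standard form; how ARS-GS parameterizes and blends these lobes per Gaussian primitive is not specified in the abstract.

```python
# Hedged sketch of an Anisotropic Spherical Gaussian (ASG) lobe evaluation,
# using the common ASG form. Parameter choices are illustrative.
import numpy as np

def asg(v, x_axis, y_axis, z_axis, lam, mu, amplitude=1.0):
    """Evaluate an ASG lobe for unit view directions v of shape (..., 3).

    (x_axis, y_axis, z_axis) form an orthonormal frame; lam and mu control
    the lobe sharpness along the x and y tangent axes, respectively.
    """
    smooth = np.clip(np.sum(v * z_axis, axis=-1), 0.0, None)  # facing term
    ex = np.sum(v * x_axis, axis=-1) ** 2
    ey = np.sum(v * y_axis, axis=-1) ** 2
    return amplitude * smooth * np.exp(-lam * ex - mu * ey)

# Narrow lobe along +z, sharper across x than y:
# x = np.array([1.0, 0, 0]); y = np.array([0, 1.0, 0]); z = np.array([0, 0, 1.0])
# print(asg(np.array([0, 0.1, 0.99]), x, y, z, lam=50.0, mu=5.0))
```
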
Morphological Convolutional Neural Network for Efficient Facial Expression Recognition.
IF 2.7
Journal of Imaging Pub Date: 2026-04-15 DOI: 10.3390/jimaging12040171
Robert, Sarifuddin Madenda, Suryadi Harmanto, Michel Paindavoine, Dina Indarti

This study proposes a morphological convolutional neural network (MCNN) architecture that integrates morphological operations with CNN layers for facial expression recognition (FER). Conventional CNN-based FER models primarily rely on appearance features and may be sensitive to illumination and demographic variations. This work investigates whether morphological structural representations provide complementary information to convolutional features. A multi-source and multi-ethnic FER dataset was constructed by combining CK+, JAFFE, KDEF, TFEID, and a newly collected Indonesian Facial Expression dataset, resulting in 3684 images from 326 subjects across seven expression classes. Subject-independent data splitting with 10-fold cross-validation was applied to ensure reliable evaluation. Experimental results show that the proposed MCNN1 model achieves an average accuracy of 88.16%, while the best MCNN2 variant achieves 88.7%, demonstrating competitive performance compared to MobileNetV2 (88.27%), VGG19 (87.58%), and the morphological baseline MNN (50.73%). The proposed model also demonstrates improved computational efficiency, achieving lower inference latency (21%) and reduced GPU memory usage (64%) compared to baseline models. These results indicate that integrating morphological representations into convolutional architectures provides a modest but consistent improvement in FER performance while enhancing generalization and efficiency under heterogeneous data conditions.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13117524/pdf/
Citations: 0
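A minimal sketch of the morphological primitives is shown below, built from pooling so they can sit inside a network: grayscale dilation as a local max and erosion as a local min under a flat structuring element. How MCNN1/MCNN2 combine these with convolutional layers is an interpretation; this shows only the primitive.

```python
# Hedged sketch of differentiable morphological layers built from pooling.
# A flat k x k structuring element is assumed.
import torch
import torch.nn.functional as F

def dilation(x, k=3):
    """Grayscale dilation: local max over a k x k window. x: (B, C, H, W)."""
    return F.max_pool2d(x, kernel_size=k, stride=1, padding=k // 2)

def erosion(x, k=3):
    """Grayscale erosion: local min, implemented as a negated max."""
    return -F.max_pool2d(-x, kernel_size=k, stride=1, padding=k // 2)

def morphological_gradient(x, k=3):
    """Edge-like structural feature: dilation minus erosion."""
    return dilation(x, k) - erosion(x, k)

# feats = morphological_gradient(torch.randn(1, 1, 48, 48))
# Such maps could be concatenated with CNN feature maps before classification.
```
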