A Deep Learning-based Ensemble Model for Automated Nasolabial Fold Severity Grading.

IF 3.0; Zone 2 (Medicine); JCR Q1 (Surgery)

Aesthetic Surgery Journal. Pub Date: 2025-08-13. DOI: 10.1093/asj/sjaf161

Hengqing Cui, Ziqi Zhang, Wenjun Zhang, Jun Zhang, Haiyan Cui

Citations: 0

Abstract

Background: Nasolabial fold (NLF) severity is a key indicator of facial aging and a frequent target in aesthetic treatments. The Wrinkle Severity Rating Scale (WSRS) is widely used for clinical grading but remains inherently subjective and vulnerable to inter-observer variability.

Objectives: This study aimed to develop and validate DeepFold, a deep learning-based ensemble model for automated, objective, and clinically interpretable grading of NLF severity based on the WSRS.

Methods: A dataset of 6,718 facial images was constructed, including 1,718 images from clinical outpatients and 5,000 from the CelebA dataset. All images were split into left and right halves and annotated independently by three senior plastic surgeons using the WSRS. ResNet-50 served as the base model architecture, and an ensemble strategy was applied using majority voting over three independently trained networks. Model training used focal loss to address class imbalance and was conducted in PyTorch with early stopping based on validation loss. Performance was assessed using accuracy, F1-score, and confusion matrix analysis.
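
As a rough illustration of two ingredients named in the Methods, the sketch below computes a per-sample focal loss and a three-way majority vote in plain Python. The abstract does not report the focusing parameter, class weights, or a tie-breaking rule, so `gamma=2.0`, `alpha=1.0`, and the lowest-grade tie-break are assumptions made for illustration, and the function names are hypothetical rather than taken from the DeepFold code.

```python
import math
from collections import Counter


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed value; not given in the abstract)
    alpha  : class weight (assumed value; not given in the abstract)
    """
    # Clamp to avoid log(0) when the true class gets zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def majority_vote(votes):
    """Return the grade predicted by most models.

    Ties (possible when three models pick three different grades) go to
    the lowest grade. This tie-break is an assumption for illustration.
    """
    counts = Counter(votes)
    top = max(counts.values())
    return min(grade for grade, count in counts.items() if count == top)


# A confidently correct sample versus a poorly classified one.
easy = focal_loss([0.05, 0.90, 0.05], target=1)
hard = focal_loss([0.60, 0.30, 0.10], target=1)

# Hypothetical WSRS grades predicted by three independently trained networks.
final_grade = majority_vote([3, 3, 4])
```

Because of the `(1 - p_t)**gamma` factor, the confidently correct sample contributes far less loss than the poorly classified one, which is how focal loss shifts training effort toward hard, often minority-class, examples. Setting `gamma=0` reduces the expression to ordinary cross-entropy.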

Results: The DeepFold ensemble model achieved a validation accuracy and F1-score of 0.917, outperforming individual baseline models such as ResNet-50 (accuracy: 0.904) and SeResNet-50 (accuracy: 0.882). Ensemble strategies reduced prediction variance and enhanced model robustness under class imbalance.
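
The reported metrics can all be derived from a confusion matrix. The sketch below shows one way to do that in plain Python. The labels are toy values invented for illustration, not data from the study, and macro averaging of the F1-score is an assumption, since the abstract does not state which averaging was used.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true grades, columns are predicted grades."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def macro_f1(matrix):
    """Unweighted mean of per-class F1 (averaging scheme is an assumption)."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, true other
        fn = sum(matrix[k]) - tp                       # true k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy grades for six half-face images (invented for illustration).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3])
acc = accuracy(cm)
f1 = macro_f1(cm)
```

Under class imbalance, accuracy can be dominated by the most frequent grade, so reading it alongside a per-class measure such as F1 and the confusion matrix itself gives a fuller picture of where a model confuses adjacent grades.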

Conclusions: DeepFold provides a reliable and standardized approach to NLF severity assessment, offering potential clinical value in aesthetic evaluation, treatment planning, and outcome monitoring.

Source journal

Aesthetic Surgery Journal (Surgery)

CiteScore: 6.20

Self-citation rate: 20.70%

Articles published: 309

Review time: 6-12 weeks

About the journal: Aesthetic Surgery Journal is a peer-reviewed international journal focusing on scientific developments and clinical techniques in aesthetic surgery. The official publication of The Aesthetic Society, ASJ is also the official English-language journal of many major international societies of plastic, aesthetic and reconstructive surgery representing South America, Central America, Europe, Asia, and the Middle East. It is also the official journal of the British Association of Aesthetic Plastic Surgeons, the Canadian Society for Aesthetic Plastic Surgery and The Rhinoplasty Society.