Vitria Wuri Handayani, Mieke Sylvia Margareth Amiatun Ruth, Riries Rulaningtyas, Muhammad Rasyad Caesarardhi, Bayu Azra Yudhantorro, Ahmad Yudianto
Development and evaluation of a convolutional neural network model for sex prediction using cephalometric radiographs and cranial photographs

BMC Medical Imaging 25(1):348, published 2025-08-25. DOI: 10.1186/s12880-025-01892-x
Abstract
Background: Accurately determining sex using features like facial bone profiles and teeth is crucial for identifying unknown victims. Lateral cephalometric radiographs effectively depict the lateral cranial structure, aiding the development of computational identification models.
Objective: This study develops and evaluates a sex prediction model using cephalometric radiographs with several convolutional neural network (CNN) architectures. The primary goal is to evaluate the model's performance on standardized radiographic data and real-world cranial photographs to simulate forensic applications.
Methods: Six CNN architectures (VGG16, VGG19, MobileNetV2, ResNet50V2, InceptionV3, and InceptionResNetV2) were trained and validated on 340 cephalometric images of Indonesian individuals aged 18 to 40 years. The data were divided into training (70%), validation (15%), and testing (15%) subsets, and data augmentation was applied to mitigate class imbalance. An additional set of 40 cranial photographs of anatomical specimens was used to evaluate the models' generalizability. Performance metrics included accuracy, precision, recall, and F1-score.
Results: CNN models were trained and evaluated on 340 cephalometric images (255 females and 85 males). On cephalometric data, VGG19 and ResNet50V2 achieved high F1-scores of 95% (females) and 83% (males), respectively, highlighting their strong class-specific performance. Although overall accuracy exceeded 90%, the F1-score better reflected model performance on this imbalanced dataset. In contrast, performance decreased markedly on cranial photographs, particularly for female samples: even InceptionResNetV2, which achieved the highest F1-score on photographs (62%), still misclassified a substantial share of females. Confusion matrices and per-class metrics further revealed persistent issues with class imbalance and generalization across imaging modalities.
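The gap the authors note between overall accuracy and per-class F1 on an imbalanced test set can be illustrated with a small hypothetical confusion matrix (the counts below are invented for illustration and are not taken from the paper; they merely use a 39:12 female-to-male test ratio consistent with the abstract's class imbalance):

```python
def f1_per_class(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical binary confusion matrix (39 female, 12 male test samples):
#                 predicted female   predicted male
# true female           38                 1
# true male              3                 9
tp_f, fn_f = 38, 1       # females: 1 missed
tp_m, fn_m = 9, 3        # males: 3 missed
fp_f, fp_m = fn_m, fn_f  # each class's misses are the other's false positives

accuracy = (tp_f + tp_m) / (tp_f + fn_f + tp_m + fn_m)
print(f"accuracy:  {accuracy:.1%}")                        # → 92.2%
print(f"F1 female: {f1_per_class(tp_f, fp_f, fn_f):.2f}")  # → 0.95
print(f"F1 male:   {f1_per_class(tp_m, fp_m, fn_m):.2f}")  # → 0.82
```

Accuracy exceeds 90% because the majority class dominates the count of correct predictions, while the minority-class F1 exposes the weaker male-class performance, which is why per-class F1 is the more informative metric here.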
Conclusions: Basic CNN models perform well on standardized cephalometric images but less effectively on photographic cranial images, indicating a domain shift between image types that limits generalizability. Improving real-world forensic performance will require further optimization and more diverse training data.
Clinical trial number: Not applicable.
About the journal:
BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.