Machine Learning for Predicting Malignant Transformation in Actinic Cheilitis: A Prognostic Support System Based on Demographic and Clinical Descriptors.
Ivan José Correia-Neto, Alex Franco da Costa, Anna Luíza Damaceno Araújo, Cristina Saldivia-Siracusa, Raísa Sales de Sá, Thiago Martini Pereira, Pablo Agustin Vargas, Alan Roger Santos-Silva, Luiz Paulo Kowalski, Matheus Cardoso Moraes, Marcio Ajudarte Lopes
Journal of Oral Pathology & Medicine, pp. 574-582. Published 2026-05-01 (Epub 2026-01-19). Publication type: Journal Article. Impact factor: 2.3; JCR: Q1 (Dentistry, Oral Surgery & Medicine).
DOI: 10.1111/jop.70113 (https://doi.org/10.1111/jop.70113)
Citations: 0
Abstract
Objective: This study aimed to develop and evaluate machine learning models to predict malignant transformation (MT) in patients with actinic cheilitis (AC).
Methods: Three hundred forty patients diagnosed with AC were carefully documented (322 in the no-MT group and 18 in the MT group). Adaptive Synthetic Sampling (ADASYN) was used to balance the dataset, yielding 322 cases in the no-MT group and 319 in the MT group. Four supervised machine learning classifiers (Random Forest, Extreme Gradient Boosting (XGBoost), Multilayer Perceptron, and Support Vector Machine) were trained and tested using 5-fold cross-validation to map inputs (clinical descriptors and demographic data) to the output (MT). SHAP (SHapley Additive exPlanations) values were used to identify the most influential predictors of MT.
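The oversampling step described in the Methods can be illustrated with a minimal, pure-Python sketch of the ADASYN idea. It is not the authors' implementation: the function names (`knn_indices`, `adasyn_sketch`), the toy two-feature data, and the choice `k=5` are assumptions made for illustration, and the interpolation step is simplified to use only minority-class neighbours. The sketch shows the core mechanism: minority samples surrounded by more majority-class neighbours receive more synthetic points.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean), excluding i."""
    dists = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]


def adasyn_sketch(X, y, k=5, seed=0):
    """Simplified ADASYN: create synthetic samples for class 1 (the minority)."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    # Number of synthetic samples that would balance the two classes exactly.
    target = (len(y) - len(minority)) - len(minority)

    # r_i: share of majority-class points among the k nearest neighbours
    # of each minority sample (a measure of how hard the sample is to learn).
    r = [sum(1 for j in knn_indices(X, i, k) if y[j] == 0) / k for i in minority]
    total = sum(r)
    if total == 0:
        return []  # no minority sample borders the majority class

    min_points = [X[i] for i in minority]
    k_min = min(k, len(min_points) - 1)
    synthetic = []
    for pos, point in enumerate(min_points):
        g_i = round(r[pos] / total * target)  # per-sample quota, rounded
        neighbours = knn_indices(min_points, pos, k_min)
        for _ in range(g_i):
            other = min_points[rng.choice(neighbours)]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(point, other)])
    return synthetic


# Toy data (hypothetical, NOT the study data): 40 majority and 5 minority cases.
data_rng = random.Random(42)
X_demo = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
X_demo += [[data_rng.gauss(1.5, 0.5), data_rng.gauss(1.5, 0.5)] for _ in range(5)]
y_demo = [0] * 40 + [1] * 5

synthetic = adasyn_sketch(X_demo, y_demo, k=5)
print(len(synthetic), "synthetic minority samples; exact balance would need", 40 - 5)
```

Because each per-sample quota is rounded, a procedure of this kind does not guarantee an exact 1:1 balance. This may be related to the 322 versus 319 group sizes reported above, although the abstract does not state the reason.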
Results: The Extreme Gradient Boosting model performed best overall, achieving 96.72% accuracy, 96.87% sensitivity, 96.57% specificity, 96.61% precision, an F1-score of 96.73%, and an AUC of 0.9498. Multilayer Perceptron showed the highest sensitivity (98.44%), and Random Forest presented comparable results. In contrast, Support Vector Machine underperformed, with more false negatives and false positives. Across models, ulceration, multifocality, and long-standing lesions were the strongest predictors of MT, while small, asymptomatic, or solitary lesions were associated with lower risk.
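For readers less familiar with the metrics listed in the Results, the sketch below shows how accuracy, sensitivity, specificity, precision, and F1-score follow from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; the abstract does not report the underlying confusion matrices, and AUC is omitted because it requires ranked prediction scores rather than counts.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics, returned as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)  # share of MT cases correctly flagged (recall)
    specificity = tn / (tn + fp)  # share of no-MT cases correctly cleared
    precision = tp / (tp + fp)    # share of flagged cases that are truly MT
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the paper.
metrics = binary_metrics(tp=90, fn=10, tn=85, fp=15)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a prognostic setting such as this one, a false negative corresponds to a lesion that transforms without having been flagged, which is why sensitivity is reported separately from overall accuracy.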
Conclusion: The results revealed promising performance metrics for Extreme Gradient Boosting and Multilayer Perceptron, suggesting their potential value as tools in a support system for monitoring AC. Additionally, synthetic data proved useful in training, enhancing the models' robustness and predictive capabilities.
About the journal:
The aim of the Journal of Oral Pathology & Medicine is to publish manuscripts of high scientific quality representing original clinical, diagnostic or experimental work in oral pathology and oral medicine. Papers advancing the science or practice of these disciplines will be welcomed, especially those which bring new knowledge and observations from the application of techniques within the spheres of light and electron microscopy, tissue and organ culture, immunology, histochemistry and immunocytochemistry, microbiology, genetics and biochemistry.