Prediction model of dental caries in 12-year-old children in Sichuan Province based on machine learning
Xinmiao Yan, Taolan Sun, Yuhang Lu, Xin Tan, Zhuo Wang, Miaojing Li
Hua xi kou qiang yi xue za zhi = Huaxi kouqiang yixue zazhi = West China journal of stomatology. Journal Article. Published 2023-12-01.
DOI: 10.7518/hxkq.2023.2023124 (https://doi.org/10.7518/hxkq.2023.2023124)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10722460/pdf/
Citations: 0
Abstract
Objectives: This study used machine learning algorithms to build a prediction model for dental caries in children, to identify the risk factors for caries, and to propose targeted measures and policy suggestions for improving children's oral health.
Methods: Stratified cluster random sampling was used. In accordance with the different policies and measures in Sichuan Province, 12-year-old students from 3 to 4 middle schools in eight cities of the province were randomly selected for a questionnaire survey, an oral examination, and a physical examination. Risk factors for dental caries in 12-year-old children were analyzed with multivariate logistic regression. The dataset was randomly divided into a training set and a validation set at a ratio of 7:3. Four prediction models (random forest, decision tree, extreme gradient boosting (XGBoost), and logistic regression) were built in R version 4.1.1, and their predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC).
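The 7:3 split and the AUC evaluation described in the Methods can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' R code: the function names `split_train_validation` and `auc`, the fixed seed, and the toy records are all assumptions made for the example. AUC is computed here by its rank interpretation, the probability that a randomly chosen child with caries receives a higher risk score than a randomly chosen child without.

```python
import random

def split_train_validation(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up toy data: (risk score from some hypothetical model, caries status).
toy_records = [(i / 10, 1 if i >= 5 else 0) for i in range(10)]
train, validation = split_train_validation(toy_records)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formula gives the same value more efficiently on a dataset of several thousand children.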
Results: A total of 4,439 children aged 12 years were included in this study. The incidence of caries in permanent teeth was 50.93%. Multivariate logistic regression analysis showed that the following factors influenced dental caries in children (P<0.05): body mass index, the father's highest educational level, the mother's highest educational level, whether the child brushed their teeth, number of brushings per day, use of toothpaste when brushing, duration of brushing, rinsing the mouth after meals, eating before bed after brushing, consumption of sweet drinks, consumption of snacks, visiting a dental clinic for examination, and the age at which tooth brushing began. The AUC values of the random forest, decision tree, logistic regression, and XGBoost models were 0.840, 0.755, 0.799, and 0.794, respectively. In the random forest model, the variable with the highest contribution was eating before bed after brushing.
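One common way to rank a variable's contribution to a fitted model is permutation importance: shuffle one variable at a time and measure how much predictive accuracy drops. The abstract does not state which importance measure the authors used, so the sketch below is only an illustration of the general idea. The data, the `toy_model` scoring rule, and the two unnamed features are made up for the example.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)

def permutation_importance(model, rows, labels, n_features, seed=0):
    """Accuracy drop observed after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        # Rebuild each row with feature j replaced by its shuffled value.
        permuted = [row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return drops

# Made-up data: feature 0 drives the outcome, feature 1 is pure noise.
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 1), data_rng.randint(0, 1)) for _ in range(200)]
labels = [row[0] for row in rows]

def toy_model(row):
    """Hypothetical stand-in for a fitted classifier: predicts from feature 0 only."""
    return row[0]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

In this toy setup, shuffling the noise feature leaves accuracy unchanged, while shuffling the informative feature lowers it, so the informative feature receives the larger importance.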
Conclusions: A prediction model of dental caries in children based on random forest was established and showed good predictive performance. Taking preventive measures against the main factors that affect the occurrence of dental caries in children is beneficial.