Predicting suicidal behavior outcomes: an analysis of key factors and machine learning models
Authors: Mohammad Bazrafshan, Kourosh Sayehmiri
Journal: BMC Psychiatry, 24(1):841 (Journal Article; JCR Q2, Psychiatry; impact factor 3.4)
Published: 2024-11-21
DOI: https://doi.org/10.1186/s12888-024-06273-2
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11583731/pdf/
Citations: 0
Abstract
Background: Suicidal behaviors, which may lead to death (suicide) or survival (suicide attempt), are influenced by various factors. Identifying the specific risk factors for suicidal behavior mortality is critical for improving prevention strategies and clinical interventions. Predicting the outcomes of suicidal behaviors can help identify individuals at higher risk of death, enabling timely and targeted interventions. This study aimed to determine the critical risk factors associated with suicidal behavior mortality and identify an effective classification model for predicting suicidal behavior outcomes.
Materials and methods: This study used data recorded in the suicidal behavior registry system of hospitals in Ilam Province. In the first phase, duplicate records were removed and the data were numerically encoded in Python 3.11; the data were then analyzed with chi-square and Fisher's exact tests in SPSS version 22 to identify the factors influencing suicidal behavior mortality. In the second phase, records with missing data were removed and the dataset was standardized. Five binary classification algorithms were applied, including Random Forest, Logistic Regression, and Decision Trees, with hyperparameters optimized against the area under the receiver operating characteristic curve (AUC) and the F1 score. The models were compared on accuracy, recall, precision, F1 score, and AUC.
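The first-phase association test can be illustrated with a minimal sketch of a Pearson chi-square test of independence. The study ran its tests in SPSS, so this is not the authors' code; the function names and the counts in `table` are hypothetical and chosen only for illustration. The closed-form p-value used here is valid only for one degree of freedom (a 2 x 2 table).

```python
from math import erfc, sqrt


def chi_square_independence(table):
    """Pearson chi-square statistic for an r x c table of counts.

    Returns (statistic, degrees_of_freedom, smallest_expected_count).
    No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, min_expected


def p_value_one_dof(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


# Hypothetical counts, NOT taken from the study:
# rows = two groups of a categorical factor, columns = (died, survived).
table = [[30, 470], [10, 490]]
stat, dof, min_exp = chi_square_independence(table)
p = p_value_one_dof(stat)
print(f"chi2={stat:.3f}, dof={dof}, min expected={min_exp:.1f}, p={p:.4f}")
```

A common rule of thumb is to prefer Fisher's exact test when some expected counts are small (often taken as below 5), which is one plausible reason for using both tests; the abstract does not state the criterion the authors applied.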
Results: Among 3833 cases of suicidal behavior in various hospitals in Ilam Province, the method of suicidal behavior (P < 0.001), reason for suicidal behavior (P < 0.001), age group (P < 0.001), education level (P < 0.001), marital status (P = 0.004), and employment status (P = 0.042) were significantly associated with suicide. The season of suicidal behavior, gender, father's education, and mother's education were not significantly related to suicidal behavior mortality. Among the models tested, the random forest model had the highest area under the ROC curve (0.79) and the highest classification accuracy and F1 score on both the training data (0.85 and 0.2, respectively) and the test data (0.86 and 0.31, respectively) for predicting suicidal behavior outcomes.
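The comparison metrics can all be derived from a confusion matrix, and the sketch below shows how high accuracy and a low F1 score can coexist. The abstract does not report the confusion matrix or the class balance, so the counts are hypothetical and the function name `binary_metrics` is an illustrative assumption. The example assumes a rare positive class (death), which is one common cause of such a gap, not a confirmed property of this dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, NOT from the study: 1000 cases, 100 in the positive class.
metrics = binary_metrics(tp=20, fp=10, fn=80, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the model classifies most of the majority class correctly, so accuracy is high, while it finds only a fifth of the positive cases, which keeps recall and F1 low.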
Conclusion: This study identified key factors such as older age, lower education, divorce or widowhood, employment, physical methods, and socioeconomic issues as significant predictors of suicidal behavior outcomes. A combination of statistical models for feature selection and machine learning algorithms for prediction was used, with Random Forest showing the best performance. This approach highlights the potential of integrating statistical methods with machine learning to improve suicide risk prediction and intervention strategies.
About the journal:
BMC Psychiatry is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of psychiatric disorders, as well as related molecular genetics, pathophysiology, and epidemiology.