Authors: Selçuk Demir, E. Şahin
Journal: European Journal of Science and Technology, volume 62 1
Published: 2023-03-08
DOI: 10.31590/ejosat.1254337 (https://doi.org/10.31590/ejosat.1254337)
Random Forest Importance-Based Feature Ranking and Subset Selection for Slope Stability Assessment using the Ranger Implementation
Slope stability problems can arise from geometrical, geological, seismic, and other factors. For many years, conventional approaches such as the limit equilibrium method, numerical methods, and statistical methods have been used successfully to predict the stability of slopes. Alongside these methods, several machine learning (ML) attempts have been made to predict slope stability using datasets available in the literature. The present study builds classification models for slope stability assessment using Ranger, an implementation of the random forest (RF) algorithm. A total of 168 cases with six input parameters (slope height, unit weight, slope angle, cohesion, pore water pressure ratio, and internal friction angle) are used to generate the models. In the first step, RF feature importance scores are determined for the six features, and five different prediction models are produced by reducing the number of features in the dataset. The developed models are then assessed using performance metrics, and the results are compared to choose the best prediction model. The results show that feature ranking and subset selection based on RF feature importance affects the performance of the models. According to the RF feature importance scores, unit weight is the most influential feature affecting slope stability for the studied dataset. In addition, the Ranger model developed with five features (Model IV) achieves the highest test accuracy, 90%.
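
The ranking and subset-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the study uses Ranger, whereas here the importance scores are simply passed in as a dictionary. The function names `rank_features` and `nested_subsets`, the feature keys, and every numeric score are hypothetical placeholders; only the top rank of unit weight is taken from the abstract. The choice `min_size=2` is an assumption made so that six features yield five subsets, since the abstract does not state which subset sizes or model numbering were used.

```python
from typing import Dict, List


def rank_features(importance: Dict[str, float]) -> List[str]:
    """Return feature names sorted from most to least important."""
    return sorted(importance, key=lambda name: importance[name], reverse=True)


def nested_subsets(ranked: List[str], min_size: int = 1) -> List[List[str]]:
    """Build nested feature subsets by dropping the lowest-ranked feature.

    The first subset holds every feature; each later subset removes the
    least important remaining feature, down to `min_size` features.
    """
    if not 1 <= min_size <= len(ranked):
        raise ValueError("min_size must be between 1 and the number of features")
    return [ranked[:k] for k in range(len(ranked), min_size - 1, -1)]


# Placeholder scores for illustration only; these are NOT values from the study.
# Only the top position of unit weight reflects what the abstract reports.
placeholder_importance = {
    "unit_weight": 0.30,
    "cohesion": 0.20,
    "internal_friction_angle": 0.18,
    "slope_height": 0.14,
    "slope_angle": 0.10,
    "pore_pressure_ratio": 0.08,
}

ranked = rank_features(placeholder_importance)

# Assumption: stop at two features so that six inputs give five candidate subsets.
subsets = nested_subsets(ranked, min_size=2)

for subset in subsets:
    print(len(subset), subset)
```

In a full workflow, each subset would be used to train and evaluate a separate classifier on a held-out test set, and the subset with the best test metrics would be kept.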