Evaluation of soil quality of cultivated lands with classification and regression-based machine learning algorithms optimization under humid environmental condition
Orhan Dengiz , Pelin Alaboz , Fikret Saygın , Kemal Adem , Emre Yüksek
Advances in Space Research, volume 74, issue 11, pages 5514-5529. Published 2024-08-22. DOI: 10.1016/j.asr.2024.08.048. URL: https://www.sciencedirect.com/science/article/pii/S027311772400872X
Abstract
In soil science, machine learning algorithms are preferred for pedotransfer functions because they allow rapid data acquisition and high prediction accuracy. This study evaluates the prediction of soil quality in agricultural lands under the humid Black Sea climate using both classification-based and regression-based algorithms: Random Forest (RF), Light Gradient Boosting (LGB), Extreme Gradient Boosting (XGBoost), k-nearest neighbors (kNN), Logistic Regression, multilayer perceptron (MLP), Linear Regression (LR), and Bayesian Ridge (BR). A comparison of soil maps is also included. The study further evaluates Grid Search optimization with K-fold cross-validation (K = 5) for both groups of algorithms. In class-based prediction, the RF and XGBoost algorithms achieved an accuracy of approximately 92 %. In regression-based prediction, the most successful algorithms were BR and LR, with an R² score of 0.84. Grid Search optimization raised the R² score to 0.90 for BR and 0.88 for LR, so the optimized hyperparameters improved prediction of the soil quality index. Gaussian and spherical models had the lowest prediction errors in the spatial distribution maps. Tree-based algorithms were found to be suitable for class-based prediction of soil quality, while linear regression methods were appropriate for regression-based prediction. The study area is characterized by a rainy climate that results in acidic soils with high organic matter content, so new studies under different climates and soil properties are recommended.
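The Grid Search with K-fold cross-validation (K = 5) mentioned in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the data are synthetic, the model is a one-feature ridge regression standing in for the regression algorithms, and every name here (`fit_ridge_1d`, `cross_val_r2`, `grid_search`, the `alphas` grid) is a hypothetical choice made for illustration.

```python
import random
from statistics import mean


def fit_ridge_1d(xs, ys, alpha):
    """Fit y = a + b*x with an L2 penalty `alpha` on the slope b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return y_bar - slope * x_bar, slope


def r2_score(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_r2(xs, ys, alpha, k=5):
    """Mean held-out R^2 over k folds for one hyperparameter value."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        preds = [a + b * xs[i] for i in fold]
        scores.append(r2_score([ys[i] for i in fold], preds))
    return mean(scores)


def grid_search(xs, ys, alphas, k=5):
    """Score every candidate alpha by k-fold CV and return the best one."""
    results = {alpha: cross_val_r2(xs, ys, alpha, k) for alpha in alphas}
    best = max(results, key=results.get)
    return best, results


# Synthetic data (an assumption, not soil measurements from the study).
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 0.5) for x in xs]

best_alpha, cv_scores = grid_search(xs, ys, alphas=[0.0, 0.1, 1.0, 10.0, 100.0])
print("best alpha:", best_alpha, "mean CV R^2:", round(cv_scores[best_alpha], 3))
```

Each candidate value is scored only on folds it was not fitted on, so the selected hyperparameter reflects held-out performance and not training fit.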
About the journal:
The COSPAR publication Advances in Space Research (ASR) is an open journal covering all areas of space research including: space studies of the Earth's surface, meteorology, climate, the Earth-Moon system, planets and small bodies of the solar system, upper atmospheres, ionospheres and magnetospheres of the Earth and planets including reference atmospheres, space plasmas in the solar system, astrophysics from space, materials sciences in space, fundamental physics in space, space debris, space weather, Earth observations of space phenomena, etc.
NB: Manuscripts on life sciences as related to space are no longer accepted for submission to Advances in Space Research. Such manuscripts should now be submitted to the new COSPAR journal Life Sciences in Space Research (LSSR).
All submissions are reviewed by two scientists in the field. COSPAR is an interdisciplinary scientific organization concerned with the progress of space research on an international scale. Operating under the rules of ICSU, COSPAR ignores political considerations and considers all questions solely from the scientific viewpoint.