Machine learning methods for basal area prediction of Fagus orientalis Lipsky stands based on national forest inventory
Authors: Seyedeh Fatemeh Hosseini, Hamid Jalilvand, Asghar Fallah, Hamed Asadi, Mahya Tafazoli
Journal: Trees, 39(2), published 2025-03-19
DOI: 10.1007/s00468-025-02616-y
URL: https://link.springer.com/article/10.1007/s00468-025-02616-y
Key Message
Machine learning models accurately predict F. orientalis stand basal area in the Hyrcanian forest using environmental variables, with the RF model performing best. Elevation is the most important predictor.
Abstract
Accurate prediction of tree basal area (BA), an important structural characteristic of forest stands, is essential for sustainable forest management. This study used four machine learning methods, namely the generalized linear model (GLM), k-nearest neighbors (kNN), support vector machine (SVM), and random forest (RF), to predict and assess the stand BA of Fagus orientalis Lipsky from national forest inventory data and a comprehensive set of environmental variables. Modeling was performed with 10-fold spatial cross-validation to counteract the effect of spatial autocorrelation in the predictor and response data and to reduce the dependency between training and test data. The RF model outperformed the others, giving the best match between measured and predicted stand BA values, with the highest squared correlation coefficient (\(R_{\text{Train}}^{2} = 0.77\); \(R_{\text{Test}}^{2} = 0.76\)) and the lowest root mean square error (\(\text{RMSE}_{\text{Train}} = 2.70\); \(\text{RMSE}_{\text{Test}} = 2.90\)) and mean absolute error (\(\text{MAE}_{\text{Train}} = 1.74\); \(\text{MAE}_{\text{Test}} = 1.76\)). Among all investigated variables, elevation showed the highest correlation with stand BA of F. orientalis in the Hyrcanian forest. The relation was positive and restricted to elevations of approximately 700 to 1200 m. The RF and GLM models ranked bulk density as the second-most important variable, whereas the SVM and kNN models ranked air temperature second. Overall, this research identifies key variables influencing the stand BA of F. orientalis, providing insights for forest management and conservation. These findings contribute to a better understanding of forest dynamics in the Hyrcanian region and can inform targeted management strategies.
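The evaluation scheme described in the abstract (spatial cross-validation scored with squared correlation, RMSE, and MAE) can be illustrated with a minimal Python sketch. The abstract does not say how the spatial folds were built, so the grid-block assignment here is an assumption, as are all function names (`spatial_folds`, `spatial_cv`, `fit_predict`) and the synthetic demo data. The sketch fits no real model; any regressor, such as a random forest, could be passed in as `fit_predict`.

```python
import math
from collections import defaultdict


def spatial_folds(coords, block_size, n_folds=10):
    """Group plots into square grid blocks, then deal whole blocks into folds.

    Keeping each block inside a single fold means nearby plots are never
    split between training and test data (assumed blocking scheme).
    """
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(i)
    folds = [[] for _ in range(n_folds)]
    # Largest blocks first, each into the currently smallest fold.
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(blocks[key])
    return folds


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    if var_o == 0 or var_p == 0:
        return 0.0  # correlation is undefined for constant input; report 0
    return cov ** 2 / (var_o * var_p)


def spatial_cv(coords, y, fit_predict, block_size, n_folds=10):
    """Run spatial CV and score the pooled held-out predictions.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains
    on the training plots and returns one prediction per test plot.
    """
    obs, pred = [], []
    for test_idx in spatial_folds(coords, block_size, n_folds):
        if not test_idx:
            continue  # fewer blocks than folds leaves some folds empty
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        pred.extend(fit_predict(train_idx, test_idx))
        obs.extend(y[i] for i in test_idx)
    return {"R2": r_squared(obs, pred), "RMSE": rmse(obs, pred), "MAE": mae(obs, pred)}


if __name__ == "__main__":
    # Synthetic plots on a 10 x 10 grid; NOT inventory data.
    coords = [(x, y) for x in range(10) for y in range(10)]
    ba = [20.0 + 0.5 * x for x, _ in coords]

    def mean_model(train_idx, test_idx):
        # Placeholder for a real regressor: predicts the training mean.
        m = sum(ba[i] for i in train_idx) / len(train_idx)
        return [m] * len(test_idx)

    print(spatial_cv(coords, ba, mean_model, block_size=5, n_folds=4))
```

Pooling the held-out predictions before scoring is one design choice. Averaging per-fold scores is an equally common alternative, and the abstract does not say which was used.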
About the journal:
Trees - Structure and Function publishes original articles on the physiology, biochemistry, functional anatomy, structure and ecology of trees and other woody plants. Also presented are articles concerned with pathology and technological problems, when they contribute to the basic understanding of structure and function of trees. In addition to original articles and short communications, the journal publishes reviews on selected topics concerning the structure and function of trees.