Chalachew Muluken Liyew, Stefano Ferraris, Elvira Di Nardo, Rosa Meo
A review of feature selection methods for actual evapotranspiration prediction
Artificial Intelligence Review, volume 58, issue 10, published 2025-07-04
DOI: 10.1007/s10462-025-11298-4
https://link.springer.com/article/10.1007/s10462-025-11298-4
Citation count: 0
Abstract
Accurate prediction of actual evapotranspiration (AET) is critical for hydrological modeling, agricultural planning, and climate studies. Machine learning (ML) models have emerged as powerful AET prediction tools because they can handle complex, nonlinear relationships in large datasets. However, the choice of input features significantly affects model performance, efficiency, and interpretability. Feature selection techniques reduce high-dimensional datasets by identifying redundant and uncorrelated variables. This paper reviews feature selection approaches for ML-based AET prediction by analyzing 62 studies, selected from a total of 416 retrieved from seven digital libraries. Our analysis shows that filter methods are the most widely used (38.8%), followed by manual selection based on domain expertise (28.7%), embedded methods (17.5%), and wrapper methods (11.2%). Dimensionality reduction techniques, such as principal component analysis (PCA), are the least used (3.8%). Among machine learning models, Random Forest (RF) and Artificial Neural Networks (ANN) are the most commonly used, with 29 and 27 instances, respectively. The study highlights the strengths and limitations of each category of feature selection, emphasizing the potential of hybrid approaches that integrate filter, wrapper, embedded, and manual selection methods. These combinations improve model accuracy, robustness, and generalization while mitigating overfitting, computational inefficiency, and sensitivity to noise. This review provides insights into optimal feature selection strategies for improving ML-based AET prediction.
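To make the most frequently reported category concrete, the following is a minimal sketch of a filter method: each candidate input is scored independently of any model (here by absolute Pearson correlation with the target) and the top-ranked inputs are kept. It is not taken from the reviewed paper. The feature names (net_radiation, soil_moisture, noise), the synthetic data, and the helper filter_select are all hypothetical and serve only as an illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def filter_select(features, target, k):
    """Rank features by |correlation with target| and keep the top k names."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic, illustrative data (not real AET measurements): the target
# depends on two inputs, while the third input is pure noise.
rng = random.Random(0)
n = 200
features = {
    "net_radiation": [rng.gauss(0, 1) for _ in range(n)],
    "soil_moisture": [rng.gauss(0, 1) for _ in range(n)],
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
aet = [
    2.0 * r + 1.0 * s + rng.gauss(0, 0.1)
    for r, s in zip(features["net_radiation"], features["soil_moisture"])
]

selected, scores = filter_select(features, aet, k=2)
print(selected)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

A correlation filter like this is cheap and model-agnostic, but it scores each input in isolation and only captures linear association. That limitation is one reason wrapper, embedded, or hybrid schemes may be preferred when inputs interact or relate nonlinearly to the target.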
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.