A review of feature selection methods for actual evapotranspiration prediction

IF 13.9, Zone 2 (Computer Science), JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Artificial Intelligence Review, Pub Date: 2025-07-04, DOI: 10.1007/s10462-025-11298-4

Chalachew Muluken Liyew, Stefano Ferraris, Elvira Di Nardo, Rosa Meo

Volume 58, issue 10. Article: https://link.springer.com/article/10.1007/s10462-025-11298-4 (PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11298-4.pdf)

Citations: 0

A review of feature selection methods for actual evapotranspiration prediction

Accurate prediction of actual evapotranspiration (AET) is critical for hydrological modeling, agricultural planning, and climate studies. Machine learning (ML) models have emerged as powerful AET prediction tools because they can handle complex, nonlinear relationships in large datasets. However, the choice of input features significantly affects model performance, efficiency, and interpretability. Feature selection techniques reduce the dimensionality of datasets by identifying and removing redundant and uncorrelated variables. This paper reviews feature selection approaches for ML-based AET prediction by analyzing 62 studies drawn from a total of 416 retrieved from seven digital libraries. Our analysis shows that filter methods are the most widely used (38.8%), followed by manual selection based on domain expertise (28.7%), embedded methods (17.5%), and wrapper methods (11.2%). Dimensionality reduction techniques, such as principal component analysis (PCA), are the least used (3.8%). Among machine learning models, Random Forest (RF) and Artificial Neural Networks (ANN) are the most commonly used, with 29 and 27 instances, respectively. The study highlights the strengths and limitations of each category of feature selection, emphasizing the potential of hybrid approaches integrating filter, wrapper, embedded, and manual selection methods. These combinations improve model accuracy, robustness, and generalization, while mitigating overfitting, computational inefficiency, and sensitivity to noise. This review provides insights into optimal feature selection strategies for improving ML-based AET prediction.
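
To make the most widely used category concrete, the following is a minimal sketch of a filter method, assuming Pearson correlation as the scoring criterion. It is not taken from the reviewed paper: the function names, the two thresholds, and the synthetic variables (`radiation`, `humidity`, `radiation_copy`, `noise`) are all illustrative assumptions. The sketch drops features that are weakly correlated with the target, then drops features that nearly duplicate one already kept.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def filter_select(features, target, min_relevance=0.2, max_redundancy=0.9):
    """Filter-style selection: skip weakly relevant, then redundant, features.

    features: dict mapping a feature name to its list of values.
    Both thresholds are illustrative assumptions, not values from the review.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    # Visit candidates from most to least relevant.
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    selected = []
    for name in ranked:
        if relevance[name] < min_relevance:
            continue  # nearly uncorrelated with the target
        if any(abs(pearson(features[name], features[kept])) > max_redundancy
               for kept in selected):
            continue  # redundant with a feature that is already selected
        selected.append(name)
    return selected, relevance


# Synthetic stand-in data with hypothetical names (not a real AET dataset).
random.seed(0)
n = 500
radiation = [random.gauss(0, 1) for _ in range(n)]
humidity = [random.gauss(0, 1) for _ in range(n)]
radiation_copy = [r + random.gauss(0, 0.05) for r in radiation]  # near-duplicate
noise = [random.gauss(0, 1) for _ in range(n)]                   # irrelevant
aet = [2.0 * r - 1.0 * h + random.gauss(0, 0.3)
       for r, h in zip(radiation, humidity)]

features = {
    "radiation": radiation,
    "humidity": humidity,
    "radiation_copy": radiation_copy,
    "noise": noise,
}
selected, relevance = filter_select(features, aet)
print(selected)
```

On this synthetic data the sketch should keep `humidity` and one of the two radiation-like columns, and discard `noise`. A wrapper or embedded method would instead score feature subsets through a trained model, which is one reason the hybrid combinations mentioned above pair a cheap filter step with a model-based step.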

Source journal

Artificial Intelligence Review (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.