Stacked Ensemble with Machine Learning Regressors on Optimal Features (SMOF) of hyperspectral sensor PRISMA for inland water turbidity prediction.
Authors: Rajarshi Bhattacharjee, Shishir Gaur, Shard Chander, Anurag Ohri, Prashant K Srivastava, Anurag Mishra
Journal: Environmental Science and Pollution Research (impact factor 5.8; category: Environmental Sciences)
Publication date: 2024-11-25
Publication type: Journal Article
DOI: 10.1007/s11356-024-35481-2 (https://doi.org/10.1007/s11356-024-35481-2)
Hyperspectral data offer substantial benefits across many domains, yet handling a large number of spectral bands and identifying the essential ones remains a formidable challenge. This study identifies the most relevant bands within a hyperspectral data cube for predicting turbidity in inland waters. Nine machine learning regressors (CatBoost, Decision Trees, Extra Trees, Gradient Boosting, Light Gradient Boosting (LightGBM), Recursive Feature Elimination (RFE), Random Forest, Support Vector Regressor (SVR), and Extreme Gradient Boosting (XGBoost)) were used to compute the feature importance of the hyperspectral bands for turbidity prediction. Random Forest outperformed the other models, with a mean absolute percentage error (MAPE) of 1.61% and an R² of 0.96 for the linear fit. Band 77, with a central wavelength of 1067.61 nm, is the most dominant band in terms of feature importance. The study also introduces a novel framework for turbidity prediction: Stacked Ensemble with Machine Learning Regressors on Optimal Features (SMOF). It builds a stacking ensemble from the nine regressors listed above, with Random Forest serving as both a base model and the meta-model, and takes the feature selection outputs as its inputs. With this framework, the MAPE reached 1.21% and the R² was 0.95. In addition, the study presents a simple statistical algorithm for detecting noisy bands in the Hyperspectral Precursor of the Application Mission (PRISMA) data cube. The approach assesses quadrat-wise intra-band spatial coherence and separates noisy bands by thresholding Rényi entropy. Radiometric calibration error and water vapour absorption are the two primary sources of noise within the data cube. The research also uses the open-source Water Colour Simulator (WASI) to simulate inland water spectra with varying levels of turbidity. Overall, the study presents an approach for identifying noisy bands and brings together the wavelengths most useful for predicting turbidity in inland waters.
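The noisy-band screening described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the entropy order, quadrat size, histogram binning, or threshold, so `alpha=2.0`, `quadrat=4`, `bins=8`, the threshold of `1.2`, and the function names `renyi_entropy`, `quadrat_entropies`, and `is_noisy` are all assumptions made for illustration. The idea shown is that a spatially coherent band has similar values inside each small tile (low Rényi entropy), whereas a noise-dominated band spreads its values across the histogram (high entropy).

```python
import math
import random


def renyi_entropy(probs, alpha=2.0):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1), in nats."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def quadrat_entropies(band, quadrat=4, bins=8, alpha=2.0):
    """Entropy of the value histogram inside each quadrat x quadrat tile.

    `band` is a 2D list of floats (one spectral band). Histogram bins span
    the band's own min-max range; this binning choice is an assumption.
    """
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant band
    entropies = []
    for r0 in range(0, len(band) - quadrat + 1, quadrat):
        for c0 in range(0, len(band[0]) - quadrat + 1, quadrat):
            counts = [0] * bins
            for r in range(r0, r0 + quadrat):
                for c in range(c0, c0 + quadrat):
                    idx = min(int((band[r][c] - lo) / span * bins), bins - 1)
                    counts[idx] += 1
            total = quadrat * quadrat
            entropies.append(renyi_entropy([n / total for n in counts], alpha))
    return entropies


def is_noisy(band, threshold, **kwargs):
    """Flag a band whose mean quadrat entropy exceeds the (assumed) threshold."""
    ent = quadrat_entropies(band, **kwargs)
    return sum(ent) / len(ent) > threshold


# Synthetic 16 x 16 bands, for illustration only (not PRISMA data).
rng = random.Random(0)
smooth_band = [[float(r + c) for c in range(16)] for r in range(16)]
noisy_band = [[rng.uniform(0.0, 30.0) for _ in range(16)] for _ in range(16)]

print("smooth band flagged:", is_noisy(smooth_band, threshold=1.2))
print("noisy band flagged:", is_noisy(noisy_band, threshold=1.2))
```

In practice the threshold would need to be tuned against bands known to be affected by water vapour absorption or calibration error. The fixed value here only separates the two synthetic bands.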
About the journal:
Environmental Science and Pollution Research (ESPR) serves the international community in all areas of Environmental Science and related subjects with emphasis on chemical compounds. This includes:
- Terrestrial Biology and Ecology
- Aquatic Biology and Ecology
- Atmospheric Chemistry
- Environmental Microbiology/Biobased Energy Sources
- Phytoremediation and Ecosystem Restoration
- Environmental Analyses and Monitoring
- Assessment of Risks and Interactions of Pollutants in the Environment
- Conservation Biology and Sustainable Agriculture
- Impact of Chemicals/Pollutants on Human and Animal Health
It reports from a broad interdisciplinary outlook.