Accurate estimation of Jujube leaf chlorophyll content using optimized spectral indices and machine learning methods integrating geospatial information
Nigela Tuerxun , Sulei Naibi , Jianghua Zheng , Renjun Wang , Lei Wang , Binbin Lu , Danlin Yu
Ecological Informatics, Volume 85, Article 102980. Published 2024-12-24. DOI: 10.1016/j.ecoinf.2024.102980. Impact factor: 5.8 (JCR Q1, Ecology). Not open access.
https://www.sciencedirect.com/science/article/pii/S1574954124005223
Citations: 0
Abstract
Leaf chlorophyll content (LCC) is vital for photosynthesis and ecosystem functioning; it influences carbon, water, and energy exchanges and serves as an indicator of photosynthetic activity and nitrogen levels in precision agriculture. Hyperspectral data enable precise LCC monitoring: spectral indices are extracted through optimal band combination (OBC), and LCC is then predicted with machine learning. However, OBC faces dimensionality issues, and machine learning models often overlook geographical influences, potentially reducing prediction accuracy. This study hypothesizes that developing spectral indices from important wavelengths and integrating geospatial data into machine learning models can address these issues and increase prediction accuracy. To test this hypothesis, a framework was developed that first uses elastic net (EN) and the successive projection algorithm (SPA) for wavelength selection, followed by spectral index creation with OBC and ranking with random forest (RF). Support vector regression (SVR), random forest regression (RFR), and geographically weighted least squares support vector regression (GWLS-SVR) were then used to assess the prediction accuracy. Finally, the optimal variables and regression model were identified. The results revealed that the EN- and SPA-based indices had stronger correlations and higher importance than the defined indices. The double-difference index (DDn) and the anti-reflectance index (ARI) were the most robust three-dimensional and two-dimensional spectral indices, respectively. GWLS-SVR required fewer indices (1–4) to achieve optimal results, with EN-DDn (2R₅₁₉ − R₇₇₅ − R₉₃₆)-GWLS-SVR performing best (R² = 0.95, RMSE = 0.61, PBIAS = −0.02). This research presents a robust framework with strong adaptability for estimating LCC in a specific study area and region, demonstrating substantial potential for the precise estimation of agroforestry vegetation parameters.
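As a rough illustration of two pieces named in the abstract, the sketch below computes a double-difference index in the quoted form (2 × R at 519 nm, minus R at 775 nm, minus R at 936 nm) and the three reported accuracy metrics (R², RMSE, PBIAS). It is not the authors' code. The function names, the dictionary-based spectrum layout, and the sample values are hypothetical, and the PBIAS sign convention and percent scaling are assumptions, since the abstract does not state them.

```python
import math
from typing import Mapping, Sequence


def ddn_index(reflectance: Mapping[int, float],
              b1: int = 519, b2: int = 775, b3: int = 936) -> float:
    """Double-difference index in the form quoted in the abstract:
    2 * R(b1) - R(b2) - R(b3), with band centres in nm.
    `reflectance` maps wavelength (nm) to reflectance (hypothetical layout)."""
    return 2.0 * reflectance[b1] - reflectance[b2] - reflectance[b3]


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pbias(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Percent bias. ASSUMPTION: computed as 100 * sum(pred - obs) / sum(obs);
    some references use the opposite sign, and the paper's convention is not
    given in the abstract."""
    return 100.0 * sum(p - o for o, p in zip(observed, predicted)) / sum(observed)


if __name__ == "__main__":
    # Made-up reflectance values for one leaf spectrum (illustration only).
    spectrum = {519: 0.10, 775: 0.45, 936: 0.48}
    print("DDn:", ddn_index(spectrum))

    # Made-up observed vs. predicted LCC values (illustration only).
    obs = [30.0, 35.0, 40.0, 45.0]
    pred = [31.0, 34.0, 41.0, 44.0]
    print("R2:", r_squared(obs, pred))
    print("RMSE:", rmse(obs, pred))
    print("PBIAS (%):", pbias(obs, pred))
```

In a full pipeline of the kind the abstract outlines, an index like this would be computed per sample and passed to a regression model; the geographically weighted step (GWLS-SVR) is not sketched here.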
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.