Winter wheat yield prediction at a county scale using time series variation features of remote sensing spectra and machine learning
Authors: Guocan Zhu, Chenxuan Zhao, Lili Zhou, Zhenhai Li, Hongchun Zhu
Journal: European Journal of Agronomy, volume 170, Article 127751 (JCR Q1, Agronomy; impact factor 5.5)
Published: 2025-06-23
DOI: 10.1016/j.eja.2025.127751
URL: https://www.sciencedirect.com/science/article/pii/S1161030125002473
Citations: 0
Abstract
Population growth, shrinking arable land, and rising food demand make accurate prediction of winter wheat yield important for national food security. Time-series remote sensing data can continuously track spectral changes over the crop growth cycle, providing a wealth of data to support yield prediction. However, raw time-series vegetation index (VI) data alone are often insufficient to reflect the crop's growth status and physiological changes accurately, which increases the uncertainty of the predictions. To address this issue, this study used time series of the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) over the whole growth period of winter wheat to extract time-series variation features of the spectral indices (dVI), such as the rate of NDVI change during the filling period and the mean NDWI during the growing season. We then compared a traditional multiple linear regression (MLR) method with two machine learning (ML) methods, random forest (RF) and extreme gradient boosting (XGBoost), for predicting county-level winter wheat yield. County-level yield prediction for 2020–2021 was most accurate with the dVI features and the RF model, with R² = 0.67, RMSE = 644.93 kg ha⁻¹, and CCC = 0.80. Compared with the original time-series VIs, combining dVI with RF proved more reliable and robust. This highlights the importance of incorporating spectral variation features into yield prediction to improve precision and reliability. Our study shows that combining ML with dVI is an effective and promising method for crop yield estimation.
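The abstract does not say how the dVI features or the accuracy metrics were computed, so the following is only a minimal sketch under stated assumptions. It takes the "rate of NDVI change during the filling period" to be a least-squares slope over a day window, the "mean NDWI" to be a plain average over the season, and CCC to be Lin's concordance correlation coefficient. The function names, the day-based window, and the sample values are all hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import fmean


def slope(days, values):
    """Least-squares slope of values against days (index units per day)."""
    d_mean, v_mean = fmean(days), fmean(values)
    num = sum((d - d_mean) * (v - v_mean) for d, v in zip(days, values))
    den = sum((d - d_mean) ** 2 for d in days)
    return num / den


def dvi_features(days, ndvi, ndwi, filling_window):
    """Two illustrative variation features for one county and season.

    filling_window is a hypothetical (start_day, end_day) pair; the paper's
    actual phenological windows are not given in the abstract.
    """
    start, end = filling_window
    idx = [i for i, d in enumerate(days) if start <= d <= end]
    return {
        "ndvi_rate_filling": slope([days[i] for i in idx],
                                   [ndvi[i] for i in idx]),
        "ndwi_mean_season": fmean(ndwi),
    }


def r2(obs, pred):
    """Coefficient of determination of predictions against observations."""
    o_mean = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - o_mean) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean square error, in the units of the yield values."""
    return sqrt(fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def ccc(obs, pred):
    """Lin's concordance correlation coefficient (assumed meaning of CCC)."""
    o_mean, p_mean = fmean(obs), fmean(pred)
    o_var = fmean([(o - o_mean) ** 2 for o in obs])
    p_var = fmean([(p - p_mean) ** 2 for p in pred])
    cov = fmean([(o - o_mean) * (p - p_mean) for o, p in zip(obs, pred)])
    return 2 * cov / (o_var + p_var + (o_mean - p_mean) ** 2)


# Made-up composite dates (days since an arbitrary start) and index values.
days = [0, 10, 20, 30, 40]
ndvi = [0.80, 0.75, 0.70, 0.65, 0.60]
ndwi = [0.2, 0.3, 0.4, 0.3, 0.3]
features = dvi_features(days, ndvi, ndwi, filling_window=(20, 40))
```

In a full workflow, one such feature dictionary per county and year would form a row of the predictor table passed to the MLR, RF, or XGBoost regressor, and the three metric functions would score the predicted yields against the county statistics. Unlike R², CCC also penalizes systematic offset between predictions and observations, which is one reason the two can diverge.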
About the journal:
The European Journal of Agronomy, the official journal of the European Society for Agronomy, publishes original research papers reporting experimental and theoretical contributions to field-based agronomy and crop science. The journal considers field-level research on agricultural, horticultural and tree crops that uses comprehensive and explanatory approaches. The EJA covers the following topics:
crop physiology
crop production and management including irrigation, fertilization and soil management
agroclimatology and modelling
plant-soil relationships
crop quality and post-harvest physiology
farming and cropping systems
agroecosystems and the environment
crop-weed interactions and management
organic farming
horticultural crops
papers from the European Society for Agronomy bi-annual meetings
In determining the suitability of submitted articles for publication, particular scrutiny is placed on the degree of novelty and significance of the research and the extent to which it adds to existing knowledge in agronomy.