Remote sensing and TerraClimate datasets for wheat yield prediction using machine learning
Authors: Alireza Araghi, Andre Daccache
Journal: Smart Agricultural Technology, volume 11, Article 100909
Publication date: 2025-03-25
Publication type: Journal Article
DOI: 10.1016/j.atech.2025.100909
URL: https://www.sciencedirect.com/science/article/pii/S277237552500142X
Impact factor: 6.3
JCR: Q1 (Agricultural Engineering)
Citation count: 0
Abstract
Ensuring food security for a continuously growing global population is one of the most significant challenges facing humanity today. The challenge is exacerbated by climate change and environmental degradation, much of which is associated with human activities. Yield prediction is vital for addressing food security at local and regional levels: by anticipating crop production, we can better manage food distribution, mitigate the risk of shortages, and support sustainable agricultural practices. Forecasting yields with biophysical crop models is laborious and requires various pedo-climatic, crop-specific, and management parameters that are often unavailable. This study combines satellite imagery and a gridded climate dataset (TerraClimate) with machine learning (ML) to predict wheat yields in Mashhad County (Northeast Iran). The analysis spans 22 years, from 2001 to 2022. Several ML models were developed and evaluated: multiple linear regression (MLR), artificial neural network (ANN), random forest (RF), and a mean ensemble (ENS) of the outputs of all selected models. The results show that irrigated and rainfed wheat yields can be predicted with reasonable accuracy using the MLR and ENS models up to 2 months before harvest. The Nash-Sutcliffe efficiency (NSE) values are 0.74 and 0.62, and the correlation coefficients (r) are 0.93 and 0.80, for irrigated and rainfed wheat, respectively. Because the input datasets have global coverage and are easy to access, the approach is applicable to other crop types and regions, overcoming the lack of on-site data that limits traditional yield prediction models.
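For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how NSE, the correlation coefficient r, and a mean ensemble are commonly computed. It is not the authors' code: the function names (`nse`, `pearson_r`, `mean_ensemble`) and the yield values are hypothetical and serve only as illustration, and NSE is assumed to take its standard form, 1 minus the ratio of the squared prediction error to the variance of the observations around their mean.

```python
import numpy as np


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    squared_error = np.sum((observed - predicted) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - squared_error / spread


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])


def mean_ensemble(*model_predictions):
    """Element-wise mean of the predictions from several models."""
    return np.mean(np.vstack(model_predictions), axis=0)


# Hypothetical yields (t/ha) for five seasons; NOT data from the study.
observed = np.array([3.0, 3.5, 2.8, 4.1, 3.9])
pred_mlr = np.array([3.1, 3.3, 2.9, 3.9, 3.8])
pred_ann = np.array([2.8, 3.6, 3.0, 4.3, 3.6])
pred_rf = np.array([3.2, 3.4, 2.7, 4.0, 4.0])

pred_ens = mean_ensemble(pred_mlr, pred_ann, pred_rf)
print("NSE:", nse(observed, pred_ens))
print("r:  ", pearson_r(observed, pred_ens))
```

An NSE of 0 means the model is no better than always predicting the mean observed yield, which is why NSE is reported alongside r: a high correlation alone does not rule out a systematic bias in the predictions.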