Dhahi Al-Shammari, Yang Chen, Niranjan S. Wimalathunge, Chen Wang, Si Yang Han, Thomas F. A. Bishop
Incorporation of mechanistic model outputs as features for data-driven models for yield prediction: a case study on wheat and chickpea
Precision Agriculture, published 2024-09-04. DOI: 10.1007/s11119-024-10184-3 (https://doi.org/10.1007/s11119-024-10184-3)
Citation count: 0
Abstract
Introduction
Data-driven models (DDMs) are increasingly used for crop yield prediction because they can capture complex patterns and relationships. DDMs rely heavily on data inputs to provide predictions. Despite their effectiveness, DDMs can be complemented by inputs derived from mechanistic models (MMs).
Methods
This study investigated whether the predictive quality of DDMs can be enhanced by using MM outputs, specifically biomass and soil moisture, as features in combination with conventional data sources such as satellite imagery, weather, and soil information. Four experiments were performed, each using a different dataset for prediction: Experiment 1 combined MM outputs with conventional data; Experiment 2 excluded MM outputs; Experiment 3 was the same as Experiment 1 but omitted all conventional temporal data; Experiment 4 used MM outputs only. The research encompassed ten field-years of wheat and chickpea yield data and applied the eXtreme Gradient Boosting (XGBOOST) algorithm for model fitting. Performance was evaluated using root mean square error (RMSE) and the concordance correlation coefficient (CCC).
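The four feature sets described above can be sketched as a simple lookup. This is a minimal illustration, not the authors' code: the column names are hypothetical, and the split of conventional data into temporal (satellite, weather) and non-temporal (soil) groups is an assumption, since the abstract does not list the individual features.

```python
# Hypothetical feature groups. The column names are illustrative only;
# the abstract names the data sources but not the individual features.
FEATURE_GROUPS = {
    "mm_outputs": ["mm_biomass", "mm_soil_moisture"],
    # Assumption: satellite imagery and weather are the temporal sources.
    "conventional_temporal": ["satellite_index", "rainfall", "temperature"],
    # Assumption: soil information is the non-temporal source.
    "conventional_static": ["soil_property_a", "soil_property_b"],
}

# Which groups each experiment uses, following the abstract's description.
EXPERIMENTS = {
    1: ["mm_outputs", "conventional_temporal", "conventional_static"],
    2: ["conventional_temporal", "conventional_static"],
    3: ["mm_outputs", "conventional_static"],
    4: ["mm_outputs"],
}


def features_for(experiment):
    """Return the flat list of feature columns used in one experiment."""
    columns = []
    for group in EXPERIMENTS[experiment]:
        columns.extend(FEATURE_GROUPS[group])
    return columns


if __name__ == "__main__":
    for number in sorted(EXPERIMENTS):
        print(number, features_for(number))
```

Each resulting column list would then select the inputs for a gradient-boosted regressor fitted on the yield data, one model per experiment and crop.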
Results and conclusions
The validation results showed that the XGBOOST model had similar predictive power for both crops in Experiments 1, 2, and 3. For chickpeas, the CCC ranged from 0.89 to 0.91 and the RMSE from 0.23 to 0.25 t ha⁻¹. For wheat, the CCC ranged from 0.87 to 0.92 and the RMSE from 0.29 to 0.35 t ha⁻¹. However, Experiment 4 significantly reduced the model's accuracy, with CCCs dropping to 0.47 for chickpeas and 0.36 for wheat, and RMSEs increasing to 0.46 and 0.65 t ha⁻¹, respectively. Ultimately, Experiments 1, 2, and 3 demonstrated comparable effectiveness, but Experiment 3 is recommended for achieving similar predictive quality with a simpler, more interpretable model using biomass and soil moisture alongside non-temporal conventional features.
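For readers unfamiliar with the two metrics reported above, the following sketch shows how RMSE and the concordance correlation coefficient are commonly computed. It assumes Lin's standard CCC definition with population (1/n) variances; the abstract does not state which variant the authors used, and the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments).

    Assumes the inputs are not both constant and equal, which would make
    the denominator zero.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # CCC penalises both scatter (via cov) and bias (via the mean offset).
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


if __name__ == "__main__":
    # Made-up yields in t/ha, used only to exercise the functions.
    observed = [1.0, 2.0, 3.0]
    predicted = [2.0, 3.0, 4.0]
    print(rmse(observed, predicted))
    print(ccc(observed, predicted))
```

Unlike Pearson correlation, CCC drops below 1 when predictions are perfectly correlated with observations but systematically offset, as in the example data, which is why it is paired with RMSE for assessing agreement.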
About the journal:
Precision Agriculture promotes the most innovative results coming from the research in the field of precision agriculture. It provides an effective forum for disseminating original and fundamental research and experience in the rapidly advancing area of precision farming.
The journal covers a broad range of topics in precision agriculture, including but not limited to:
Natural Resources Variability: Soil and landscape variability, digital elevation models, soil mapping, geostatistics, geographic information systems, microclimate, weather forecasting, remote sensing, management units, scale, etc.
Managing Variability: Sampling techniques, site-specific nutrient and crop protection chemical recommendation, crop quality, tillage, seed density, seed variety, yield mapping, remote sensing, record keeping systems, data interpretation and use, crops (corn, wheat, sugar beets, potatoes, peanut, cotton, vegetables, etc.), management scale, etc.
Engineering Technology: Computers, positioning systems, DGPS, machinery, tillage, planting, nutrient and crop protection implements, manure, irrigation, fertigation, yield monitor and mapping, soil physical and chemical characteristic sensors, weed/pest mapping, etc.
Profitability: MEY, net returns, BMPs, optimum recommendations, crop quality, technology cost, sustainability, social impacts, marketing, cooperatives, farm scale, crop type, etc.
Environment: Nutrient, crop protection chemicals, sediments, leaching, runoff, practices, field, watershed, on/off farm, artificial drainage, ground water, surface water, etc.
Technology Transfer: Skill needs, education, training, outreach, methods, surveys, agri-business, producers, distance education, Internet, simulation models, decision support systems, expert systems, on-farm experimentation, partnerships, quality of rural life, etc.