Richard J. Fischer, Hossein Moradi Rekabdarkolaee, Deepak R. Joshi, David E. Clay, Sharon A. Clay
Soybean prediction using computationally efficient Bayesian spatial regression models and satellite imagery
Agronomy Journal, 116(6), 2841-2849. Published 2024-09-03. DOI: 10.1002/agj2.21670
https://onlinelibrary.wiley.com/doi/10.1002/agj2.21670
Abstract
Preharvest yield estimates can be used for harvest planning, marketing, and prescribing in-season fertilizer and pesticide applications. One approach that is being widely tested is the use of machine learning (ML) or artificial intelligence (AI) algorithms to estimate yields. However, one barrier to the adoption of this approach is that ML/AI algorithms behave as a black box. An alternative approach is to create an algorithm using Bayesian statistics, in which prior information is used to help build the model. However, algorithms based on Bayesian statistics are often not computationally efficient. The objective of the current study was to compare the accuracy and computational efficiency of four Bayesian models that used different assumptions to reduce the execution time. In this paper, the Bayesian multiple linear regression (BLR), Bayesian spatial regression (SRM), Bayesian skewed spatial regression (sSRM), and Bayesian nearest neighbor Gaussian process (NNGP) models were compared with a non-Bayesian ML random forest model. In this analysis, soybean (Glycine max) yield was the response variable (y), and space-based blue, green, red, and near-infrared reflectance measured with the PlanetScope satellite were the predictors (x). Among the Bayesian models tested, the NNGP model (R²-testing = 0.485), which captures short-range correlation, outperformed the BLR (R²-testing = 0.02), SRM (R²-testing = 0.087), and sSRM (R²-testing = 0.236) models. However, associated with improved accuracy was an increase in run time from 534 s for the BLR model to 2047 s for the NNGP model. These data show that relatively accurate within-field yield estimates can be obtained without sacrificing computational efficiency and that the coefficients have biological meaning. However, all Bayesian models had lower R² values and higher execution times than the random forest model.
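To make the simplest of the compared models concrete, the following is a minimal sketch of a Bayesian linear regression on four reflectance bands, scored with a held-out R². It is not the authors' implementation. The data are synthetic, the noise variance is treated as known, the prior is a zero-mean Gaussian chosen for its closed-form posterior, and the names `blr_posterior` and `r2_score` are hypothetical helpers introduced here for illustration.

```python
import numpy as np


def blr_posterior(X, y, prior_var=10.0, noise_var=1.0):
    """Closed-form posterior for weights w under the assumed model
    w ~ N(0, prior_var * I) and y | w ~ N(X w, noise_var * I)."""
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def r2_score(y_true, y_pred):
    """Coefficient of determination on a held-out set."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for blue, green, red, and near-infrared reflectance.
# These values are made up and are NOT the study's PlanetScope data.
rng = np.random.default_rng(0)
n = 400
bands = rng.uniform(0.0, 1.0, size=(n, 4))
X = np.column_stack([np.ones(n), bands])           # intercept + 4 bands
true_w = np.array([3.0, -0.5, 0.2, -1.0, 2.0])     # assumed coefficients
y = X @ true_w + rng.normal(0.0, 0.3, size=n)      # synthetic "yield"

# Simple train/test split: first 300 rows train, last 100 rows test.
train, test = np.arange(300), np.arange(300, n)
w_mean, w_cov = blr_posterior(X[train], y[train], prior_var=10.0, noise_var=0.09)
r2_test = r2_score(y[test], X[test] @ w_mean)

print("posterior mean of coefficients:", np.round(w_mean, 2))
print("posterior std of coefficients:", np.round(np.sqrt(np.diag(w_cov)), 3))
print("R2 on held-out data:", round(float(r2_test), 3))
```

The posterior mean and standard deviation attach an interpretable estimate and uncertainty to each band's coefficient. This model treats observations as independent. The spatial models in the abstract add correlated error terms, which this sketch does not attempt.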
About the journal:
After critical review and approval by the editorial board, AJ publishes articles reporting research findings in soil–plant relationships; crop science; soil science; biometry; crop, soil, pasture, and range management; crop, forage, and pasture production and utilization; turfgrass; agroclimatology; agronomic models; integrated pest management; integrated agricultural systems; and various aspects of entomology, weed science, animal science, plant pathology, and agricultural economics as applied to production agriculture.
Notes are published about apparatus, observations, and experimental techniques. Observations usually are limited to studies and reports of unrepeatable phenomena or other unique circumstances. Review and interpretation papers are also published, subject to standard review. Contributions to the Forum section deal with current agronomic issues and questions in brief, thought-provoking form. Such papers are reviewed by the editor in consultation with the editorial board.