Soybean prediction using computationally efficient Bayesian spatial regression models and satellite imagery

IF 2.0; Zone 3, Agricultural and Forestry Sciences; JCR Q2, Agronomy

Agronomy Journal Pub Date : 2024-09-03 DOI:10.1002/agj2.21670

Richard J. Fischer, Hossein Moradi Rekabdarkolaee, Deepak R. Joshi, David E. Clay, Sharon A. Clay

Volume 116, issue 6, pages 2841-2849. Full text: https://onlinelibrary.wiley.com/doi/10.1002/agj2.21670


Abstract

Preharvest yield estimates can be used for harvest planning, marketing, and prescribing in-season fertilizer and pesticide applications. One widely tested approach is to estimate yields with machine learning (ML) or artificial intelligence (AI) algorithms. However, one barrier to adoption is that ML/AI algorithms behave as a black box. An alternative is to build the algorithm with Bayesian statistics, in which prior information helps shape the model. However, algorithms based on Bayesian statistics are often not computationally efficient. The objective of the current study was to compare the accuracy and computational efficiency of four Bayesian models that used different assumptions to reduce execution time. The Bayesian multiple linear regression (BLR), Bayesian spatial regression (SRM), Bayesian skewed spatial regression (sSRM), and Bayesian nearest neighbor Gaussian process (NNGP) models were compared with a non-Bayesian ML random forest model. In this analysis, soybean (Glycine max) yield was the response variable (y), and space-based blue, green, red, and near-infrared reflectance measured with the PlanetScope satellite were the predictors (x). Among the Bayesian models tested, the NNGP model (R²-testing = 0.485), which captures short-range correlation, outperformed the BLR (R²-testing = 0.02), SRM (R²-testing = 0.087), and sSRM (R²-testing = 0.236) models. However, the improved accuracy came with an increase in run time from 534 s for the BLR model to 2047 s for the NNGP model. These data show that relatively accurate within-field yield estimates can be obtained without sacrificing computational efficiency and that the coefficients have biological meaning. However, all Bayesian models had lower R² values and higher execution times than the random forest model.
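As a rough illustration of the simplest model family named in the abstract, the sketch below fits a conjugate Bayesian linear regression of yield on four reflectance bands and reports R² on held-out points. It is not the authors' implementation: the data are synthetic, and the zero-mean Gaussian prior, the known noise variance, the train/test split, and all function names are assumptions made for the example.

```python
import numpy as np


def fit_bayesian_linear_regression(X, y, prior_precision=1.0, noise_variance=1.0):
    """Posterior mean and covariance of the weights (intercept first).

    Assumes a N(0, I / prior_precision) prior on the weights and Gaussian
    noise with known variance, so the posterior is available in closed form.
    """
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    precision = prior_precision * np.eye(X1.shape[1]) + X1.T @ X1 / noise_variance
    cov = np.linalg.inv(precision)
    mean = cov @ X1.T @ y / noise_variance
    return mean, cov


def predict(X, weight_mean):
    """Posterior-mean prediction for new predictor rows."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ weight_mean


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for the blue, green, red, and near-infrared bands
# (hypothetical values, not PlanetScope data).
rng = np.random.default_rng(0)
n = 400
bands = rng.uniform(0.0, 1.0, size=(n, 4))
true_w = np.array([-1.0, 0.5, -2.0, 3.0])  # assumed band effects
yields = 3.0 + bands @ true_w + rng.normal(0.0, 0.5, size=n)

# Hold out the last 100 points to compute a testing R².
train, test = slice(0, 300), slice(300, None)
w_mean, w_cov = fit_bayesian_linear_regression(
    bands[train], yields[train], prior_precision=1e-2, noise_variance=0.25
)
r2_test = r_squared(yields[test], predict(bands[test], w_mean))
print(f"R2-testing on synthetic data: {r2_test:.3f}")
```

The diagonal of `w_cov` gives the posterior variance of each coefficient, which is one way a Bayesian fit exposes uncertainty that a black-box model does not. The spatial models in the abstract add a spatially correlated random effect on top of this regression, which this sketch omits.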



Source journal

Agronomy Journal (Agricultural and Forestry Sciences: Agronomy)

CiteScore: 4.70

Self-citation rate: 9.50%

Articles published: 265

Review time: 4.8 months

About the journal: After critical review and approval by the editorial board, AJ publishes articles reporting research findings in soil–plant relationships; crop science; soil science; biometry; crop, soil, pasture, and range management; crop, forage, and pasture production and utilization; turfgrass; agroclimatology; agronomic models; integrated pest management; integrated agricultural systems; and various aspects of entomology, weed science, animal science, plant pathology, and agricultural economics as applied to production agriculture. Notes are published about apparatus, observations, and experimental techniques. Observations are usually limited to studies and reports of unrepeatable phenomena or other unique circumstances. Review and interpretation papers are also published, subject to standard review. Contributions to the Forum section deal with current agronomic issues and questions in brief, thought-provoking form. Such papers are reviewed by the editor in consultation with the editorial board.