Contrasting performance of panel and time-series data models for subnational crop forecasting in Sub-Saharan Africa

IF 5.6 · Zone 1 (Agricultural and Forestry Sciences) · Q1 AGRONOMY

Agricultural and Forest Meteorology Pub Date : 2024-11-05 DOI:10.1016/j.agrformet.2024.110213

Donghoon Lee, Frank Davenport, Shraddhanand Shukla, Greg Husak, Chris Funk, James Verdin

Volume 359, Article 110213 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0168192324003265

Citations: 0

Abstract

We examine methodologies for subnational crop yield and production forecasting that integrate Earth Observation (EO) datasets with advanced machine learning approaches. We evaluated diverse input data types, cross-validation methods, and training durations, focusing on maize production and yield predictions in Burkina Faso and Somalia. Central to our analysis is a comparison between a panel data (PD) model, which uses time-invariant features, and a time-series data (TD) model. The TD model performed well in predicting both production and yield, while the PD model offered comparable yield predictions. Time-invariant features such as livelihood zones, soil properties, and cropland extents enriched the spatial understanding of crop data, increasing R-squared by 0.09 (0.21) for production and 0.11 (0.03) for yield, and reducing the Mean Absolute Percentage Error by 90% (238%) for production and 5% (4%) for yield in Burkina Faso (Somalia). While Burkina Faso's consistent crop data allowed effective modeling with brief training periods, Somalia benefited from the PD model's adaptability to outliers in crop statistics, particularly with extended training in high-producing regions. The PD approach showed promise in addressing data gaps, although predicting crop production for unobserved districts remained a challenge. Our findings highlight how well EO data and machine learning combine for agricultural forecasting and emphasize the importance of region-specific methodologies, especially in the rapidly changing landscape of EO data convergence.
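The distinction between the two model types can be illustrated with a minimal sketch. It assumes a TD model is trained separately for each district on time-varying predictors only, while a PD model pools all districts into one training set and appends time-invariant features to every row; this is one common reading of the terms, not necessarily the authors' exact setup. The records, the `cropland_frac` feature, and all function names are hypothetical and exist only for illustration. The two metrics named in the abstract, R-squared and Mean Absolute Percentage Error, are included in their standard textbook forms.

```python
from collections import defaultdict

# Hypothetical records: (district, year, eo_predictor, yield_t_per_ha).
# All values are invented for illustration; none come from the paper.
RECORDS = [
    ("A", 2019, 0.42, 1.10),
    ("A", 2020, 0.55, 1.30),
    ("B", 2019, 0.31, 0.70),
    ("B", 2020, 0.36, 0.80),
]

# Hypothetical time-invariant feature per district (assumed name).
STATIC = {"A": {"cropland_frac": 0.35}, "B": {"cropland_frac": 0.12}}


def time_series_layout(records):
    """TD layout: one training set per district, time-varying predictors only."""
    per_district = defaultdict(list)
    for district, _year, eo, target in records:
        per_district[district].append(([eo], target))
    return dict(per_district)


def panel_layout(records, static):
    """PD layout: one pooled training set with static features on every row."""
    rows = []
    for district, _year, eo, target in records:
        rows.append(([eo, static[district]["cropland_frac"]], target))
    return rows


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean Absolute Percentage Error, in percent (observed must be nonzero)."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)
```

Because MAPE divides by the observed value, districts with very small reported production can push it well above 100%, which is consistent with the large production-side MAPE changes quoted in the abstract.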


Source journal

Agricultural and Forest Meteorology (Agricultural and Forestry Sciences: Forestry)

CiteScore: 10.30

Self-citation rate: 9.70%

Articles published: 415

Review time: 69 days

About the journal: Agricultural and Forest Meteorology is an international journal for the publication of original articles and reviews on the inter-relationship between meteorology, agriculture, forestry, and natural ecosystems. Emphasis is on basic and applied scientific research relevant to practical problems in the field of plant and soil sciences, ecology and biogeochemistry as affected by weather as well as climate variability and change. Theoretical models should be tested against experimental data. Articles must appeal to an international audience. Special issues devoted to single topics are also published. Typical topics include canopy micrometeorology (e.g., canopy radiation transfer, turbulence near the ground, evapotranspiration, energy balance, fluxes of trace gases), micrometeorological instrumentation (e.g., sensors for trace gases, flux measurement instruments, radiation measurement techniques), aerobiology (e.g., the dispersion of pollen, spores, insects and pesticides), biometeorology (e.g., the effect of weather and climate on plant distribution, crop yield, water-use efficiency, and plant phenology), forest-fire/weather interactions, and feedbacks from vegetation to weather and the climate system.