Knowledge-guided machine learning with multivariate sparse data for crop growth modelling

IF 5.6, Zone 1 (Agricultural and Forestry Sciences), JCR Q1 (Agronomy)
Jingye Han, Liangsheng Shi, Qi Yang, Jin Yu, Ioannis N. Athanasiadis
Field Crops Research, Volume 328, Article 109912. Published 2025-04-23. DOI: 10.1016/j.fcr.2025.109912
https://www.sciencedirect.com/science/article/pii/S0378429025001777

Abstract

Context

Process-based crop models are widely used to simulate crop growth. However, these models are limited by simplified process representations and by the difficulty of parameter estimation. Machine learning methods, an emerging paradigm, have shown potential to circumvent these limitations, but they are criticized for their black-box nature, which does not necessarily encompass known crop growth mechanisms, and for their demand for large datasets, which may not be available in most agricultural applications.

Objective

This research aims to propose a deep learning architecture that leverages agronomic knowledge and sparse observational data for multivariable crop simulation, thereby establishing a novel paradigm for crop growth modelling.

Methods

We propose a Deep learning Crop Growth Model (DeepCGM) with a mass-conserving architecture that adheres to the principles of crop growth. Two additional knowledge-guided constraints regarding crop physiology and model convergence are designed so that the model can be trained with sparse datasets. An observational dataset from a two-year rice experiment of 105 plots is used to evaluate DeepCGM against a process-based crop model (ORYZA2000) and two classical deep learning models, with augmentation methods also employed. To demonstrate the validity and generalizability of the proposed model, we also conducted a replication case study on a three-year rice experiment totaling 122 plots.
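The abstract does not describe DeepCGM's internals, so the following is only a minimal sketch of what a mass-conserving update and a sparse-data loss could look like, not the authors' architecture. The organ names, the softmax partitioning, and the functions `grow_one_day` and `masked_mse` are all hypothetical illustrations: the day's new biomass is split into non-negative fractions that sum to one, so the total mass added to the organs always equals the mass gained, and the loss is computed only on days that have an observation.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into non-negative fractions that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grow_one_day(organ_biomass, new_biomass, partition_scores):
    """Distribute the day's new biomass among organs without creating or losing mass.

    organ_biomass    -- dict: organ name -> current dry weight (hypothetical units)
    new_biomass      -- total dry matter gained today (assumed >= 0)
    partition_scores -- dict: organ name -> unconstrained score, e.g. a network output
    """
    names = list(organ_biomass)
    fractions = softmax([partition_scores[n] for n in names])
    # Returns a new dict; the input state is left unchanged.
    return {n: organ_biomass[n] + f * new_biomass for n, f in zip(names, fractions)}

def masked_mse(predicted, observed):
    """Mean squared error over days with an observation (None marks a missing day)."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if o is not None]
    if not pairs:
        return 0.0
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

# Illustrative values only, not data from the study.
state = {"leaf": 10.0, "stem": 5.0, "panicle": 0.0}
after = grow_one_day(state, 3.0, {"leaf": 0.2, "stem": 0.1, "panicle": -2.0})
```

Because the fractions come from a softmax, no organ can lose mass in this sketch and the organ increments always sum to `new_biomass`, whatever scores the network emits. A real model would also need terms for processes such as senescence or translocation, which this sketch omits.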

Results

The DeepCGM architecture produces physically plausible crop growth curves for all simulated variables, whereas the classical machine learning models may make unreasonable predictions that violate the law of mass conservation. Furthermore, DeepCGM simulates the observed growth process more accurately than the traditional process-based model: overall accuracy across all variables, measured by weighted normalized mean square error, improves by 8.3% (2019) and 16.9% (2018).
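The abstract names the metric but not its exact definition, so the sketch below shows one plausible reading of a weighted normalized mean square error and of the relative improvement over a baseline. Dividing by the squared mean of the observations, the per-variable weights, and the function names are all assumptions made for illustration; the paper's own formula may differ.

```python
def normalized_mse(predicted, observed):
    """MSE over observed days, divided by the squared mean observation (assumed normalization)."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if o is not None]
    if not pairs:
        raise ValueError("no observations available for this variable")
    mean_obs = sum(o for _, o in pairs) / len(pairs)
    mse = sum((p - o) ** 2 for p, o in pairs) / len(pairs)
    return mse / mean_obs ** 2

def weighted_nmse(predictions, observations, weights):
    """Weighted average of per-variable normalized MSE; the weights are hypothetical."""
    total_weight = sum(weights.values())
    return sum(
        weights[v] * normalized_mse(predictions[v], observations[v]) for v in weights
    ) / total_weight

def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline model."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Illustrative values only, not data from the study.
obs = {"biomass": [2.0, None, 2.0], "lai": [1.0, 3.0, None]}
pred = {"biomass": [2.0, 9.9, 4.0], "lai": [1.0, 3.0, 5.0]}
score = weighted_nmse(pred, obs, {"biomass": 1.0, "lai": 1.0})
```

Normalizing each variable before averaging keeps variables with large magnitudes, such as biomass, from dominating those with small magnitudes, such as leaf area index. Skipping `None` entries lets the same code handle sparse field observations.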

Conclusions

Knowledge-guided deep learning can integrate the principal mechanisms of the crop growth process with deep learning. It addresses the issue of data scarcity, thereby facilitating data-driven crop growth modelling with multivariable sparse datasets.

Implications or significance

This study highlights the potential of knowledge-guided deep learning to overcome structural error due to the simplification in conventional crop models and to reduce the data requirements of data-driven models. The capacity to autonomously identify multivariable dynamic patterns in crop growth from sparse data suggests a new generation of crop growth models.
Source journal

Field Crops Research (Agricultural and Forestry Sciences, Agronomy)
CiteScore: 9.60
Self-citation rate: 12.10%
Articles published: 307
Review time: 46 days
Journal description: Field Crops Research is an international journal publishing scientific articles on experimental and modelling research at field, farm and landscape levels on temperate and tropical crops and cropping systems, with a focus on crop ecology and physiology, agronomy, and plant genetics and breeding.