Jingye Han, Liangsheng Shi, Qi Yang, Jin Yu, Ioannis N. Athanasiadis
Field Crops Research, volume 328, Article 109912. Published 2025-04-23. DOI: 10.1016/j.fcr.2025.109912. URL: https://www.sciencedirect.com/science/article/pii/S0378429025001777
Knowledge-guided machine learning with multivariate sparse data for crop growth modelling
Context
Process-based crop models are widely used to simulate crop growth. However, these models are limited by their simplified process representation and by the difficulty of parameter estimation. Machine learning methods, as an emerging paradigm, have shown potential to circumvent these limitations, but they are criticized for their black-box nature, which does not necessarily encompass known crop growth mechanisms, and for their demand for big data, which may not be available in most agricultural applications.
Objective
This research aims to propose a deep learning architecture that can leverage agronomic knowledge and sparse observational data for multivariable crop simulation, thereby establishing a novel paradigm for crop growth modelling.
Methods
We propose a Deep learning Crop Growth Model (DeepCGM) with a mass-conserving architecture that adheres to the principles of crop growth. Two additional knowledge-guided constraints regarding crop physiology and model convergence are designed to train the model with sparse datasets. An observational dataset from a two-year rice experiment of 105 plots is used to evaluate DeepCGM against a process-based crop model (ORYZA2000) and two classical deep learning models, also employing augmentation methods. To demonstrate the validity and generalizability of the proposed model, we also conducted a replication case study of a three-year rice experiment totaling 122 plots.
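To make the idea of a mass-conserving update concrete, the following is a minimal sketch and not the DeepCGM architecture itself, which the abstract does not specify. It assumes that biomass gained in one time step is split among organ pools by fractions that are non-negative and sum to one (obtained here with a softmax), so the total mass after the step equals the total before plus the gain. The function names, the three pools, and the numbers are all hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax: outputs are non-negative and sum to 1."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def mass_conserving_step(pools, new_biomass, logits):
    """Distribute newly gained biomass across organ pools.

    pools       : current dry weight of each organ (hypothetical pools)
    new_biomass : non-negative biomass gained in this time step
    logits      : unconstrained scores, e.g. the output of a neural network
    """
    # Softmax turns arbitrary scores into allocation fractions that sum to 1,
    # so mass is neither created nor lost by the allocation (assumed design).
    fractions = softmax(np.asarray(logits, dtype=float))
    return np.asarray(pools, dtype=float) + fractions * new_biomass


# Hypothetical pools (leaf, stem, panicle) and a hypothetical daily gain.
pools = np.array([10.0, 5.0, 0.0])
updated = mass_conserving_step(pools, new_biomass=3.0, logits=[0.2, 1.0, -0.5])

# Total mass after the step equals total before plus the gain.
print(np.isclose(updated.sum(), pools.sum() + 3.0))
```

Because the constraint is built into the update rather than added as a penalty, any values of the logits yield a mass-consistent state, which is one plausible reading of "mass-conserving architecture".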
Results
The DeepCGM architecture produces physically plausible crop growth curves for all simulated variables, whereas the classical machine learning models may make unreasonable predictions that violate the law of mass conservation. Furthermore, DeepCGM simulates the observed growth process more accurately than the traditional process-based model, with overall accuracy (weighted normalized mean square error) across all variables improving by 8.3 % (2019) and 16.9 % (2018).
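The abstract does not define the weighted normalized mean square error, so the sketch below shows only one plausible form, under stated assumptions: missing observations are marked as NaN and skipped, each variable's mean square error is normalized by its squared mean observation, and the per-variable scores are combined with user-supplied weights. The function name, normalization choice, and example data are hypothetical.

```python
import numpy as np


def weighted_normalized_mse(pred, obs, weights):
    """Weighted normalized MSE over sparsely observed variables.

    pred, obs : arrays of shape (time, variables); NaN in obs marks a
                time step with no observation for that variable
    weights   : one non-negative weight per variable
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    mask = ~np.isnan(obs)                      # True where an observation exists
    obs_filled = np.where(mask, obs, 0.0)      # placeholder at missing entries
    sq_err = np.where(mask, (pred - obs_filled) ** 2, 0.0)

    # Per-variable MSE over observed entries only
    # (assumes every variable is observed at least once).
    mse = sq_err.sum(axis=0) / mask.sum(axis=0)

    # Assumed normalization: squared mean of the available observations.
    nmse = mse / np.nanmean(obs, axis=0) ** 2

    return float(np.sum(weights * nmse) / np.sum(weights))


# Hypothetical data: two variables, each observed on two of three dates.
obs = np.array([[1.0, np.nan], [2.0, 4.0], [np.nan, 6.0]])
pred = np.array([[1.0, 9.0], [3.0, 4.0], [5.0, 5.0]])
score = weighted_normalized_mse(pred, obs, weights=[1.0, 1.0])
print(score)
```

Predictions at dates without observations do not affect the score, which is what lets a loss of this kind be evaluated on sparse, multivariable field data.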
Conclusions
Knowledge-guided deep learning can integrate the principal mechanisms of the crop growth process with deep learning. It addresses the issue of data scarcity, thereby facilitating data-driven crop growth modelling with multivariable sparse datasets.
Implications or significance
This study highlights the potential of knowledge-guided deep learning to overcome the structural error caused by simplification in conventional crop models and to reduce the data requirements of data-driven models. The capacity to autonomously identify multivariable dynamic patterns in crop growth from sparse data suggests a new generation of crop growth models.
About the journal:
Field Crops Research is an international journal publishing scientific articles on experimental and modelling research at field, farm and landscape levels, on temperate and tropical crops and cropping systems, with a focus on crop ecology and physiology, agronomy, and plant genetics and breeding.