Beibei Wang, Xiao Huang, Hongxing He, Conrad Zorn, Jiangnan Wang, Wenzhou Guo, Jiarui Wu, Shengchao Qiao, Lingling Kong, Peifang Wang, Chaoqing Yu
Journal: Agricultural and Forest Meteorology, Volume 373, Article 110740
Published: 2025-07-25
DOI: 10.1016/j.agrformet.2025.110740
URL: https://www.sciencedirect.com/science/article/pii/S0168192325003594
Impact factor: 5.7; JCR: Q1 (Agronomy)
Carbon dynamics, emission mitigation, and yield optimization in farmlands: A machine learning framework for multi-variable prediction
Machine learning (ML) has become a promising approach in agro-ecosystem applications for simulating fertilization management practices (FMPs) and their impact on crop yield, soil organic carbon (SOC) concentration, and greenhouse gas (GHG) emissions. However, existing ML-based studies often focus on predicting single variables and lack a systematic framework, which limits model reliability and practical application. This study addresses these gaps by proposing a systematic, modular framework for ML-based multi-variable prediction in agro-ecosystems. Using a comprehensive field-measured dataset, we developed a multi-variable system to predict crop yield, SOC accumulation, and GHG emissions (N₂O, CH₄) for three staple crops in China. The study evaluates how a range of preprocessing techniques and algorithms affect model performance and demonstrates the model’s application in the North China Plain (NCP) to identify FMPs that balance productivity with carbon mitigation. Our results show that the combination of k-Nearest Neighbors missing-data imputation, Local Outlier Factor outlier detection, and the Random Forest algorithm delivered the best performance for yield (R² = 0.83), SOC (R² = 0.91), and N₂O emissions (R² = 0.83). These models outperformed other candidates across all environmental, soil, and management subgroups. Notably, models with similar accuracy when evaluated against measured variables can differ substantially in their predictions of the effects of FMPs relative to conventional practices, highlighting the importance of robust model selection for providing reliable guidance in optimizing FMPs. Partial Dependence Plot (PDP) analysis revealed distinct phases in SOC accumulation, underscoring the need for input datasets with broad temporal coverage to capture both short-term dynamics and long-term trends in SOC. Across the case study area, we find that 50% manure-N substitution can reduce global warming potential by 29.5% for maize and 19.5% for wheat while increasing SOC concentration more than fivefold over 30 years.
About the journal:
Agricultural and Forest Meteorology is an international journal for the publication of original articles and reviews on the inter-relationship between meteorology, agriculture, forestry, and natural ecosystems. Emphasis is on basic and applied scientific research relevant to practical problems in the field of plant and soil sciences, ecology and biogeochemistry as affected by weather as well as climate variability and change. Theoretical models should be tested against experimental data. Articles must appeal to an international audience. Special issues devoted to single topics are also published.
Typical topics include canopy micrometeorology (e.g., canopy radiation transfer, turbulence near the ground, evapotranspiration, energy balance, fluxes of trace gases), micrometeorological instrumentation (e.g., sensors for trace gases, flux measurement instruments, radiation measurement techniques), aerobiology (e.g., the dispersion of pollen, spores, insects and pesticides), biometeorology (e.g., the effect of weather and climate on plant distribution, crop yield, water-use efficiency, and plant phenology), forest-fire/weather interactions, and feedbacks from vegetation to weather and the climate system.