Yue Li, Yuxin Miao, Akshat Rawat, Kirk Stueve, Weixing Cao
Computers and Electronics in Agriculture, Volume 237, Article 110487. Published 2025-05-15. DOI: 10.1016/j.compag.2025.110487. https://www.sciencedirect.com/science/article/pii/S0168169925005939
Identifying key soil and landscape factors influencing yield spatial patterns for management zone delineation using ensemble machine learning
Dividing a field into a few homogeneous management zones (MZs) is a promising strategy for precision agriculture. Previous studies demonstrated that the Yield Spatial Trend (YST) map, the yield temporal stability map, and soil-landscape factors should be combined for MZ delineation. The soil-landscape factors used should significantly influence crop yield, but how to identify the specific soil-landscape factors for MZ delineation has not been sufficiently studied. The objectives of this study were to: 1) evaluate the performance of different Machine Learning (ML) algorithms for predicting YST; 2) compare the effectiveness of freely available soil and landscape data and different soil sensing data in explaining YST variability; 3) identify key soil-landscape factors influencing YST using an ensemble stacking regression model based on various data sources; and 4) evaluate the key factors identified from the free dataset for MZ delineation. The study was based on two corn (Zea mays L.)-soybean (Glycine max L.) rotation fields in western Minnesota, USA, covering data from 2014 to 2023. Eleven ML models, along with an ensemble model combining stacking techniques and SHapley Additive exPlanations (SHAP) values, were trained and evaluated for predicting YST and identifying key factors. Recursive feature elimination was employed to determine the optimal variables from multiple datasets: free data, free data combined with SoilOptix sensor data, free data combined with Veris sensor data, and the combined full dataset. The results demonstrated that the stacking model consistently achieved the highest accuracy (R² = 0.54–0.79) compared to the other ML models (R² = 0.02–0.73). While integrating SoilOptix and Veris sensor data improved model performance (R² = 0.66–0.79), using only the free dataset was sufficient to explain 54%–72% of the YST variability, making this approach both cost-effective and scalable for farmers. The key factors influencing YST varied between the two fields: relative elevation, slope, and soil brightness index were most important in Field 1, while soil organic matter, relative elevation, and bulk density were more critical in Field 2. Using only ten key factors in Field 1 and three in Field 2 explained approximately 51% and 64% of the YST variability, respectively. More studies are needed to develop practical and efficient MZ delineation methods and zone-specific management strategies to improve crop productivity, resource use efficiency, economic profitability, and environmental sustainability.
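The workflow summarized in the abstract (recursive feature elimination followed by a stacking regressor scored with R²) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the two base learners (ridge regression and a random forest) stand in for the eleven models the study used, the feature count of three is arbitrary, and SHAP analysis is omitted. It assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): rows play the role of grid cells in a field,
# columns play the role of soil-landscape covariates, y plays the role of YST.
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=3, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Recursive feature elimination: refit the estimator and drop the weakest
# feature each round until only three features remain.
selector = RFE(LinearRegression(), n_features_to_select=3)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.support_)

# Stacking: base learners produce cross-validated predictions, and a
# meta-learner (here plain linear regression) is trained on those predictions.
stack = StackingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=1.0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train[:, selected], y_train)

# R² on held-out data: the share of target variance the model explains.
r2 = r2_score(y_test, stack.predict(X_test[:, selected]))
print("selected feature indices:", selected.tolist())
print("held-out R2:", round(r2, 3))
```

An R² of 0.54 on held-out data corresponds to the abstract's phrasing of "explaining 54% of the YST variability". For real fields, a spatially aware train/test split would likely be needed, since a random split over neighboring grid cells can overstate accuracy.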
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.