Improving Yield Data Analysis Using Contextual Data
Elizabeth M. Hawkins, Dennis R. Buckmaster
Applied Engineering in Agriculture, published 2023-01-01
DOI: 10.13031/aea.14655 (https://doi.org/10.13031/aea.14655)
Impact factor: 0.8; JCR: Q4 (Agricultural Engineering)
Citations: 0
Abstract
Highlights
Context-driven yield data cleaning resulted in more accurate whole-field yield estimates.
Using a context-driven yield data cleaning method can improve yield estimates for zones within fields.
Identifying error-prone areas in a field where data quality is likely to be low, and removing that data in bulk, can reduce data cleaning bias.

Abstract. As agriculture becomes more data-driven, decision-making has become the focus of the industry and data quality will be increasingly important. Traditionally, yield data cleaning techniques have removed individual data points based on criteria primarily focused on the yield values themselves. However, when these methods are used, the underlying causes of the errors are often overlooked; as a result, these techniques may fail to remove all of the inaccurate (error-prone) data, remove legitimate data, or both. In this research, an alternative to data cleaning was developed. Data integrity zones (DIZ) within each field were identified by evaluating metadata, which included data collected by the combine that reported the operating conditions of the machinery (i.e., travel speed, crop mass flow), data about the field environment (i.e., soil type, topography, weather), and data on field operations (e.g., field logs, as-applied maps). Data in DIZ were isolated using buffers, and the analysis of the reduced datasets was compared to that of the raw data. The amount of data removed depended on the amount of variability (e.g., soil characteristics, topography) in the field. Statistical comparisons showed that the mean yield estimates for soil type polygons increased by an average of 1.4 Mg/ha for corn when DIZ data were used instead of raw data. On average, the confidence around the mean remained similar even with a large amount (70%) of data removed. Notably, none of the mean estimates derived from raw datasets were contained in the confidence intervals produced from DIZ data. This metadata (context-driven) alternative to data cleaning effectively removed errors and artifacts from yield data that would only be identified when looking beyond the yield measurements themselves. When similarly reduced datasets are used to analyze historical yield data, they should provide a clearer picture of the true yield effects of treatments, management zones, soil types, etc.; this will improve decisions on input and resource allocation, support wiser adoption of precision agricultural technologies, and refine future data collection.

Keywords: Combine yield monitor, Context, Data analysis, Integrity zones, Management zones, Metadata, Precision agriculture, Yield, Yield data.
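The comparison described in the abstract (keep only points inside data integrity zones, then compare the mean and its confidence interval against the raw data) can be illustrated with a minimal sketch. This is not the authors' implementation: the `YieldPoint` fields, the 20 m buffer, the speed range, the normal-approximation interval, and the six sample readings are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass
class YieldPoint:
    yield_mg_ha: float      # yield monitor reading, Mg/ha
    speed_m_s: float        # combine travel speed, m/s
    edge_distance_m: float  # distance to nearest field or soil-polygon edge, m


def in_integrity_zone(p, buffer_m=20.0, speed_range=(1.0, 2.5)):
    """Hypothetical context rule: keep points away from edges at steady speed.

    The buffer width and speed limits are placeholder values, not values
    reported in the paper.
    """
    lo, hi = speed_range
    return p.edge_distance_m >= buffer_m and lo <= p.speed_m_s <= hi


def mean_with_ci(values, z=1.96):
    """Return the mean and a normal-approximation 95% confidence interval."""
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, (m - half_width, m + half_width)


# Synthetic readings, made up for this sketch only.
points = [
    YieldPoint(12.1, 1.8, 45.0),
    YieldPoint(11.8, 1.9, 60.0),
    YieldPoint(12.4, 1.7, 35.0),
    YieldPoint(6.2, 0.4, 5.0),    # slow pass near the headland
    YieldPoint(8.9, 2.0, 8.0),    # inside the edge buffer
    YieldPoint(15.9, 3.1, 50.0),  # speed surge
]

# Context decides which points stay; the yield value itself is never tested.
diz = [p for p in points if in_integrity_zone(p)]

raw_mean, raw_ci = mean_with_ci([p.yield_mg_ha for p in points])
diz_mean, diz_ci = mean_with_ci([p.yield_mg_ha for p in diz])
raw_mean_inside_diz_ci = diz_ci[0] <= raw_mean <= diz_ci[1]

print(f"raw: mean={raw_mean:.2f} Mg/ha, CI=({raw_ci[0]:.2f}, {raw_ci[1]:.2f})")
print(f"DIZ: mean={diz_mean:.2f} Mg/ha, CI=({diz_ci[0]:.2f}, {diz_ci[1]:.2f})")
print("raw mean inside DIZ CI:", raw_mean_inside_diz_ci)
```

The filter looks only at context (speed and position) and never at the yield value, which is the distinction the abstract draws from conventional cleaning. In practice the edge distances would come from a GIS buffer operation on field and soil polygons rather than a precomputed column.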
About the journal:
This peer-reviewed journal publishes applications of engineering and technology research that address agricultural, food, and biological systems problems. Submissions must include results of practical experiences, tests, or trials presented in a manner and style that will allow easy adaptation by others; results of reviews or studies of installations or applications with substantially new or significant information not readily available in other refereed publications; or a description of successful methods or techniques of education, outreach, or technology transfer.