Jill M. Derwin, Valerie A. Thomas, Randolph H. Wynne, Karen G. Schleeweis, John W. Coulston, S. Seth Peery, Kurt Luther, Greg C. Liknes, Stacie Bender, Susmita Sen
{"title":"影响航拍测量树冠覆盖度众包解译一致性的因素","authors":"Jill M. Derwin , Valerie A. Thomas , Randolph H. Wynne , Karen G. Schleeweis , John W. Coulston , S. Seth Peery , Kurt Luther , Greg C. Liknes , Stacie Bender , Susmita Sen","doi":"10.1016/j.ecoinf.2025.103300","DOIUrl":null,"url":null,"abstract":"<div><div>Machine learning models are typically data-hungry algorithms that require large data inputs for training. When they produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for the collection of photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd in the completion of repetitive discrete analyses or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected by an anonymous crowd to those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while those collected by the control group were collected by known interpreters in a more traditional setting. Both groups carried out this task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with control interpretations. Using this model, we identified the most important variables in estimating the relationship between a location’s characteristics and interpretation behaviors which affect consistency in interpretations between crowd workers our control group. Overall, we show low agreement between crowdsourced and control interpretations, as well as interpretations from individual control group members. This warrants caution in considering the crowdsourced photointerpretation of TCC as a data source for model training and validation without adequate interpreter training as well as significant quality control measures and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable related to the use of the false color display during interpretation followed by a variable related to the use of the natural color display during interpretation, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact display has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and future implementations of crowdsourced photointerpretation. These include the enhanced use of existing mechanisms within Mechanical Turk such as worker qualifications to identify and reward more attentive workers, as well as enhanced attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task or other measures to reduce the punishment of access to HITs for workers who took their time providing detailed interpretations. 
We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.</div></div>","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"91 ","pages":"Article 103300"},"PeriodicalIF":7.3000,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Factors influencing the consistency in crowdsourced interpretations of aerial photographs to measure tree canopy cover\",\"authors\":\"Jill M. Derwin , Valerie A. Thomas , Randolph H. Wynne , Karen G. Schleeweis , John W. Coulston , S. Seth Peery , Kurt Luther , Greg C. Liknes , Stacie Bender , Susmita Sen\",\"doi\":\"10.1016/j.ecoinf.2025.103300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Machine learning models are typically data-hungry algorithms that require large data inputs for training. When they produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for the collection of photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd in the completion of repetitive discrete analyses or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected by an anonymous crowd to those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while those collected by the control group were collected by known interpreters in a more traditional setting. Both groups carried out this task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with control interpretations. Using this model, we identified the most important variables in estimating the relationship between a location’s characteristics and interpretation behaviors which affect consistency in interpretations between crowd workers our control group. Overall, we show low agreement between crowdsourced and control interpretations, as well as interpretations from individual control group members. This warrants caution in considering the crowdsourced photointerpretation of TCC as a data source for model training and validation without adequate interpreter training as well as significant quality control measures and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable related to the use of the false color display during interpretation followed by a variable related to the use of the natural color display during interpretation, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact display has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and future implementations of crowdsourced photointerpretation. 
These include the enhanced use of existing mechanisms within Mechanical Turk such as worker qualifications to identify and reward more attentive workers, as well as enhanced attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task or other measures to reduce the punishment of access to HITs for workers who took their time providing detailed interpretations. We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.</div></div>\",\"PeriodicalId\":51024,\"journal\":{\"name\":\"Ecological Informatics\",\"volume\":\"91 \",\"pages\":\"Article 103300\"},\"PeriodicalIF\":7.3000,\"publicationDate\":\"2025-07-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Ecological Informatics\",\"FirstCategoryId\":\"93\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1574954125003097\",\"RegionNum\":2,\"RegionCategory\":\"环境科学与生态学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ECOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ecological Informatics","FirstCategoryId":"93","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1574954125003097","RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ECOLOGY","Score":null,"Total":0}
Factors influencing the consistency in crowdsourced interpretations of aerial photographs to measure tree canopy cover
Machine learning models are typically data-hungry algorithms that require large data inputs for training. When these models are used to produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for collecting photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd to complete repetitive discrete analyses, or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected from an anonymous crowd with those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while the control interpretations were provided by known interpreters in a more traditional setting. Both groups carried out the task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with the control interpretations. Using this model, we identified the most important variables relating a location’s characteristics and interpretation behaviors to the consistency of interpretations between crowd workers and our control group. Overall, we found low agreement between crowdsourced and control interpretations, as well as among interpretations from individual control group members. This warrants caution when considering crowdsourced photointerpretation of TCC as a data source for model training and validation in the absence of adequate interpreter training, substantial quality control measures, and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable was related to the use of the false-color display during interpretation, followed by a variable related to the use of the natural-color display, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact that display choice has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and for future implementations of crowdsourced photointerpretation. These include greater use of existing mechanisms within Mechanical Turk, such as worker qualifications, to identify and reward more attentive workers, as well as closer attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task, or other measures, to avoid penalizing access to HITs for workers who took their time to provide detailed interpretations. We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.
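The abstract describes a Monte Carlo framework in which crowd interpretations are repeatedly resampled to fit a classification model predicting whether each crowd interpretation is consistent with the control interpretations, and predictor importance is then summarized across resamples. The sketch below illustrates that general workflow under stated assumptions only: the column names, the ±10 % agreement tolerance, the random forest classifier, and the synthetic data are hypothetical illustrations and are not taken from the paper.

# Minimal, hypothetical sketch of a Monte Carlo consistency-modelling workflow.
# Column names, the agreement tolerance, and the classifier choice are assumptions
# for illustration; the abstract does not specify these details.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Hypothetical table: one row per crowd interpretation of a plot, with the
# control group's TCC estimate for the same plot and candidate predictors.
crowd = pd.DataFrame({
    "plot_id": np.arange(1000) % 250,
    "crowd_tcc": rng.uniform(0, 100, 1000),             # crowd-interpreted % canopy cover
    "control_tcc": rng.uniform(0, 100, 1000),           # control-group % canopy cover
    "n_plots_interpreted": rng.integers(1, 400, 1000),  # proxy for interpreter experience
    "used_false_color": rng.integers(0, 2, 1000),       # display-setting indicators
    "used_natural_color": rng.integers(0, 2, 1000),
    "time_on_task_s": rng.uniform(5, 300, 1000),
})

# Label a crowd interpretation "consistent" if it falls within an assumed
# tolerance (here +/- 10 % cover) of the control interpretation.
crowd["consistent"] = (crowd["crowd_tcc"] - crowd["control_tcc"]).abs() <= 10

predictors = ["n_plots_interpreted", "used_false_color",
              "used_natural_color", "time_on_task_s"]

# Monte Carlo loop: resample interpretations with replacement, refit the
# classifier, and accumulate variable importances across iterations.
importances = []
for _ in range(100):
    boot = crowd.sample(frac=1.0, replace=True)
    clf = RandomForestClassifier(n_estimators=200, n_jobs=-1)
    clf.fit(boot[predictors], boot["consistent"])
    importances.append(clf.feature_importances_)

mean_importance = pd.Series(np.mean(importances, axis=0), index=predictors)
print(mean_importance.sort_values(ascending=False))

In practice, the predictors would be the plot-characteristic and interpretation-behavior variables discussed in the abstract (e.g., number of plots interpreted, use of the false-color and natural-color displays), and the averaged importances would indicate which of them most strongly relate to crowd-control consistency.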
Journal introduction:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.