Jill M. Derwin, Valerie A. Thomas, Randolph H. Wynne, Karen G. Schleeweis, John W. Coulston, S. Seth Peery, Kurt Luther, Greg C. Liknes, Stacie Bender, Susmita Sen
{"title":"影响航拍测量树冠覆盖度众包解译一致性的因素","authors":"Jill M. Derwin , Valerie A. Thomas , Randolph H. Wynne , Karen G. Schleeweis , John W. Coulston , S. Seth Peery , Kurt Luther , Greg C. Liknes , Stacie Bender , Susmita Sen","doi":"10.1016/j.ecoinf.2025.103300","DOIUrl":null,"url":null,"abstract":"<div><div>Machine learning models are typically data-hungry algorithms that require large data inputs for training. When they produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for the collection of photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd in the completion of repetitive discrete analyses or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected by an anonymous crowd to those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while those collected by the control group were collected by known interpreters in a more traditional setting. Both groups carried out this task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with control interpretations. Using this model, we identified the most important variables in estimating the relationship between a location’s characteristics and interpretation behaviors which affect consistency in interpretations between crowd workers our control group. Overall, we show low agreement between crowdsourced and control interpretations, as well as interpretations from individual control group members. This warrants caution in considering the crowdsourced photointerpretation of TCC as a data source for model training and validation without adequate interpreter training as well as significant quality control measures and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable related to the use of the false color display during interpretation followed by a variable related to the use of the natural color display during interpretation, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact display has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and future implementations of crowdsourced photointerpretation. These include the enhanced use of existing mechanisms within Mechanical Turk such as worker qualifications to identify and reward more attentive workers, as well as enhanced attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task or other measures to reduce the punishment of access to HITs for workers who took their time providing detailed interpretations. 
We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.</div></div>","PeriodicalId":51024,"journal":{"name":"Ecological Informatics","volume":"91 ","pages":"Article 103300"},"PeriodicalIF":7.3000,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Factors influencing the consistency in crowdsourced interpretations of aerial photographs to measure tree canopy cover\",\"authors\":\"Jill M. Derwin , Valerie A. Thomas , Randolph H. Wynne , Karen G. Schleeweis , John W. Coulston , S. Seth Peery , Kurt Luther , Greg C. Liknes , Stacie Bender , Susmita Sen\",\"doi\":\"10.1016/j.ecoinf.2025.103300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Machine learning models are typically data-hungry algorithms that require large data inputs for training. When they produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for the collection of photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd in the completion of repetitive discrete analyses or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected by an anonymous crowd to those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while those collected by the control group were collected by known interpreters in a more traditional setting. Both groups carried out this task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with control interpretations. Using this model, we identified the most important variables in estimating the relationship between a location’s characteristics and interpretation behaviors which affect consistency in interpretations between crowd workers our control group. Overall, we show low agreement between crowdsourced and control interpretations, as well as interpretations from individual control group members. This warrants caution in considering the crowdsourced photointerpretation of TCC as a data source for model training and validation without adequate interpreter training as well as significant quality control measures and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable related to the use of the false color display during interpretation followed by a variable related to the use of the natural color display during interpretation, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact display has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and future implementations of crowdsourced photointerpretation. 
These include the enhanced use of existing mechanisms within Mechanical Turk such as worker qualifications to identify and reward more attentive workers, as well as enhanced attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task or other measures to reduce the punishment of access to HITs for workers who took their time providing detailed interpretations. We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.</div></div>\",\"PeriodicalId\":51024,\"journal\":{\"name\":\"Ecological Informatics\",\"volume\":\"91 \",\"pages\":\"Article 103300\"},\"PeriodicalIF\":7.3000,\"publicationDate\":\"2025-07-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Ecological Informatics\",\"FirstCategoryId\":\"93\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1574954125003097\",\"RegionNum\":2,\"RegionCategory\":\"环境科学与生态学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ECOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ecological Informatics","FirstCategoryId":"93","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1574954125003097","RegionNum":2,"RegionCategory":"环境科学与生态学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ECOLOGY","Score":null,"Total":0}
Factors influencing the consistency in crowdsourced interpretations of aerial photographs to measure tree canopy cover
Machine learning models are typically data-hungry algorithms that require large data inputs for training. When these models are used to produce wall-to-wall remote sensing products, model validation also requires large sets of temporally harmonized field observations. Crowdsourcing may offer a potential solution for collecting photointerpretations for the training and validation of spatial models of tree canopy cover (TCC), as it harnesses the power of a large anonymous crowd to complete repetitive discrete analyses, or human intelligence tasks (HITs). This study explores the factors that determine the consistency of TCC interpretations collected from an anonymous crowd with those collected by a control group. The crowd interpretations were obtained through an anonymous platform with a task-reward framework, while the control interpretations were provided by known interpreters in a more traditional setting. Both groups carried out the task using an interface developed for Amazon’s Mechanical Turk platform. We collected multiple interpretations at sample plot locations from both crowd and control interpreters, and sampled these data in a Monte Carlo framework to estimate a classification model predicting the consistency of each crowd interpretation with the control interpretations. Using this model, we identified the most important variables relating a location’s characteristics and interpretation behaviors to the consistency of interpretations between crowd workers and our control group. Overall, we found low agreement between crowdsourced and control interpretations, as well as among interpretations from individual control group members. This warrants caution when considering crowdsourced photointerpretation of TCC as a data source for model training and validation in the absence of adequate interpreter training, substantial quality control measures, and consistency standards. We show that the number of plots interpreted was the strongest indicator of the reliability of an individual’s interpretations, further evidenced by apparent fatigue effects in crowd interpretations. The second most important variable was related to the use of the false-color display during interpretation, followed by a variable related to the use of the natural-color display, reflecting the differences in interpretation methodologies used by crowd workers and control group interpreters and the impact that display choice has on the interpretation of tree canopy cover. Finally, we discuss recommendations for further study and for future implementations of crowdsourced photointerpretation. These include greater use of existing mechanisms within Mechanical Turk, such as worker qualifications, to identify and reward more attentive workers, as well as closer attention to quality control measures throughout the data collection process and measures to increase intrinsic motivation. For our study we also recommend a minimum time on task, or other measures, to avoid penalizing access to HITs for workers who took their time to provide detailed interpretations. We also recommend using optimized default interface settings instead of providing a variety of options to the interpreter.
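The abstract describes a Monte Carlo framework in which crowd interpretations are repeatedly resampled to fit a classification model predicting whether each crowd interpretation is consistent with the control interpretations, and predictor importance is then summarized across resamples. The sketch below illustrates that general workflow under stated assumptions only: the column names, the ±10 % agreement tolerance, the random forest classifier, and the synthetic data are hypothetical illustrations and are not taken from the paper.

# Minimal, hypothetical sketch of a Monte Carlo consistency-modelling workflow.
# Column names, the agreement tolerance, and the classifier choice are assumptions
# for illustration; the abstract does not specify these details.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Hypothetical table: one row per crowd interpretation of a plot, with the
# control group's TCC estimate for the same plot and candidate predictors.
crowd = pd.DataFrame({
    "plot_id": np.arange(1000) % 250,
    "crowd_tcc": rng.uniform(0, 100, 1000),             # crowd-interpreted % canopy cover
    "control_tcc": rng.uniform(0, 100, 1000),           # control-group % canopy cover
    "n_plots_interpreted": rng.integers(1, 400, 1000),  # proxy for interpreter experience
    "used_false_color": rng.integers(0, 2, 1000),       # display-setting indicators
    "used_natural_color": rng.integers(0, 2, 1000),
    "time_on_task_s": rng.uniform(5, 300, 1000),
})

# Label a crowd interpretation "consistent" if it falls within an assumed
# tolerance (here +/- 10 % cover) of the control interpretation.
crowd["consistent"] = (crowd["crowd_tcc"] - crowd["control_tcc"]).abs() <= 10

predictors = ["n_plots_interpreted", "used_false_color",
              "used_natural_color", "time_on_task_s"]

# Monte Carlo loop: resample interpretations with replacement, refit the
# classifier, and accumulate variable importances across iterations.
importances = []
for _ in range(100):
    boot = crowd.sample(frac=1.0, replace=True)
    clf = RandomForestClassifier(n_estimators=200, n_jobs=-1)
    clf.fit(boot[predictors], boot["consistent"])
    importances.append(clf.feature_importances_)

mean_importance = pd.Series(np.mean(importances, axis=0), index=predictors)
print(mean_importance.sort_values(ascending=False))

In practice, the predictors would be the plot-characteristic and interpretation-behavior variables discussed in the abstract (e.g., number of plots interpreted, use of the false-color and natural-color displays), and the averaged importances would indicate which of them most strongly relate to crowd-control consistency.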
Journal introduction:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, as well as the critical need for informing sustainable management in view of global environmental and climate change.
The nature of the journal is interdisciplinary at the crossover between ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, modelling and prediction of ecological data.