A beginner’s guide and best practices for using crowdsourcing platforms for survey research: The Case of Amazon Mechanical Turk (MTurk)

C. Cobanoglu, Muhittin Cavusoglu, Gozde Turktarhan
{"title":"使用众包平台进行调查研究的初学者指南和最佳实践:亚马逊土耳其机器人(MTurk)案例","authors":"C. Cobanoglu, Muhittin Cavusoglu, Gozde Turktarhan","doi":"10.5038/2640-6489.6.1.1177","DOIUrl":null,"url":null,"abstract":"\n Introduction\n \n Researchers around the globe are utilizing crowdsourcing tools to reach respondents for quantitative and qualitative research (Chambers & Nimon, 2019). Many social science and business journals are receiving studies that utilize crowdsourcing tools such as Amazon Mechanical Turk (MTurk), Qualtrics, MicroWorkers, ShortTask, ClickWorker, and Crowdsource (e.g., Ahn, & Back, 2019; Ali et al., 2021; Esfahani, & Ozturk, 2019; Jeong, & Lee, 2017; Zhang et al., 2017). Even though the use of these tools presents a great opportunity for sharing large quantities of data quickly, some challenges must also be addressed. The purpose of this guide is to present the basic ideas behind the use of crowdsourcing for survey research and provide a primer for best practices that will increase their validity and reliability.\n \n What is crowdsourcing research?\n \n Crowdsourcing describes the collection of information, opinions, or other types of input from a large number of people, typically via the internet, and which may or may not receive (financial) compensation (Hargrave, 2019; Oxford Dictionary, n.d.). Within the behavioral science realm, crowdsourcing is defined as the use of internet services for hosting research activities and for creating opportunities for a large population of participants. Applications of crowdsourcing techniques have evolved over the decades, establishing the strong informational power of crowds. The advent of Web 2.0 has expanded the possibilities of crowdsourcing, with new online tools such as online reviews, forums, Wikipedia, Qualtrics, or MTurk, but also other platforms such as Crowdflower and Prolific Academic (Peer et al., 2017; Sheehan, 2018).\n Crowdsourcing platforms in the age of Web 2.0 use remote labor recruited via the internet to assist employers complete tasks that cannot be left to machines. Key characteristics of crowdsourcing include payment for workers, their recruitment from any location, and the completion of tasks (Behrend et al., 2011). They also allow for a relatively quick collection of data compared to data collection in the field, and participants are rewarded with an incentive—often financial compensation. Crowdsourcing not only offers a large participation pool but also a streamlined process for the study design, participant recruitment, and data collection as well as integrated participant compensation system (Buhrmester et al., 2011). Also, compared to other traditional marketing firms, crowdsourcing makes it easier to detect possible sampling biases (Garrow et al., 2020). Due to advantages such as reduced costs, diversity of participants, and flexibility, crowdsourcing platforms have surged in popularity for researchers.\n \n Advantages\n \n MTurk is one of the most popular crowdsourcing platforms among researchers, allowing Requesters to submit tasks for Workers to complete (Cummings & Sibona, 2017). MTurk has been used as an online crowdsourcing platform for the recruitment of human subjects for research purposes (Paolacci & Chandler, 2014). Research has also shown MTurk to be a reliable and cost-effective tool, capable of providing representative data for research in the behavioral sciences (e.g., Crump et al., 2013; Goodman et al., 2013; Mason & Suri, 2012; Rand, 2012; Simcox & Fiez, 2014). 
In addition to its use in social science studies, the platform has been used in marketing, hospitality and tourism, psychology, political science, communication, and sociology contexts (Sheehan, 2018). To illustrate, between 2012 and 2017, more than 40% of the studies published in the Journal of Consumer Research used crowdsourcing websites for their data collection (Goodman & Paolacci, 2017).\n \n Disadvantages\n \n Although researchers have assessed crowdsourcing platforms as reliable and cost-effective for data collection in the behavioral sciences, they are not exempt of flaws. One disadvantage is the possibility of unsatisfactory data quality. In fact, the virtual setting of the survey implies that the investigator is physically separated from the participant, and this lack of monitoring could lead to data quality issues (Sheehan, 2018). In addition, participants in survey research on crowdsourcing platforms are not always who they claim to be, creating issues of trust with the data provided and, ultimately, the quality of the research findings (McGonagle, 2015; Smith et al., 2016).\n A recurrent concern with MTurk workers, for instance, is their assessment as experienced survey takers (Chandler et al., 2015). This experience is mainly acquired through completion of dozens of surveys per day, especially when they are faced with similar items and scales. Smith et al. (2016) identified two types of problems performing data collection using MTurk; namely, cheaters and speeders. As compared to Qualtrics—which has a strict screening and quality-control processes to ensure that participants are who they claim to be—MTurk appears to be less exigent regarding the workers. However, a downside for data collection with Qualtrics is more expensive fees—about $5.00 per questionnaire on Qualtrics, against $0.50 to $1.50 on MTurk (Ford, 2017). Hence, few researchers were able to conduct surveys and compare respondent pools with Qualtrics or other traditional marketing research firms (Garrow et al., 2020).\n Another challenge using MTurk arises when trying to collect a desired number of responses from a population targeted to a specific city or area (Ross et al., 2010). The issues inherent to the selection process of MTurk have been the subject of investigations in several studies (e.g., Berinsky et al., 2012; Chandler et al., 2014; 2015; Harms & DeSimone, 2015; Paolacci et al., 2010; Rand, 2012). Feitosa et al. (2015) pointed out that international respondents may still identify themselves as U.S. respondents with the use of fake addresses and accounts. They found that 5% to 10% of participants identifying themselves as U.S. respondents were actually from overseas locations. Moreover, Babin et al. (2016) assessed that the use of trap questions allowed researchers to uncover that many respondents change their genders, ages, careers, or income within the course of a single survey. The issues of (a) experienced workers for the quality control of questions and (b) speeders, which, for MTurk can be attributed to the platform being the main source of revenue for a given respondent, remain the inherent issues of crowdsourcing platforms used for research purposes.\n \n Best practices\n \n Some best practices can be recommended in the use of crowdsourcing platforms for data collection purposes. Workers IDs can be matched with IDs from previous studies, thus allowing researchers to exclude responses from workers who had answered previous similar studies (Goodman & Paolacci, 2017). 
Furthermore, proceed to a manual assignment of qualification on MTurk prior to data collection (Litman et al., 2015; Park & Park, 2020). When dealing with experienced workers, both using multiple attention checks and optimizing the survey in a way to have the participants exposed to the stimuli for a sufficient length of time to better address the questions are also recommended (Sheehan, 2018). In this sense, shorter surveys are preferred to longer ones, which affect the participant’s concentration, and may, in turn, adversely impact the quality of their answers. Most importantly, pretest the survey to make sure that all parts are working as expected.\n Researchers should also keep in mind that in the context of MTurk, the primary method for measurement is the web interface. Thus, to avoid method biases, researchers should ponder whether or not method factors emerge in the latent measurement models (Podsakoff et al., 2012). As such, time-lagged research designs may be preferred as predictor and criterion variables can be measured at different points in time or administered in different platforms, such as Qualtrics vs MTurk (Cheung et al., 2017). In general, the use of crowdsourcing platforms including MTurk may be appropriate according to the research question; and the quality of data is reliant on the quality-control strategies used by researchers to enhance data quality. Trade-offs between various validity types need to be prioritized according to the research objectives (Cheung et al., 2017).\n From our experience using crowdsourcing tools for our own research as the editorial team members of several journals and chair of several conferences, we provide the best practices as outlined below:\n \n MTurk Worker (Respondent) Selection:\n \n \n Researchers should consider their study population before using MTurk for data collection. The MTurk platform should be used for the appropriate study population. For example, if the study targets restaurant owners or company CEOs, MTurk workers may not be suitable for the study. However, if the target population is diners, hotel guests, grocery shoppers, online shoppers, students, or hourly employees, utilizing a sample from MTurk would be suitable.\n \n Researchers should use the selection tool in the software. For example, if you target workers only from one country, exclude responses that came from an internet protocol (IP) address outside the targeted country and report the results in the method section.\n Researchers should consider the demographics of workers on MTurk which must reflect the study targeted population. For example, if the study focuses on baby boomers use of technology, then the MTurk sample should include only baby boomers. Similarly, the gender balance, racial composition, and income of people on MTurk should mirror the targeted population.\n Researchers should use multiple screening tools that identify quality respondents and avoid problematic response patterns. For example, MTurk provides the approval rate for the respondents. This refers to how many times a respondent is rejected for various reasons (i.e., wrong code entered). 
We recommend using a 90% or higher approv","PeriodicalId":448415,"journal":{"name":"Journal of Global Business Insights","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":"{\"title\":\"A beginner’s guide and best practices for using crowdsourcing platforms for survey research: The Case of Amazon Mechanical Turk (MTurk)\",\"authors\":\"C. Cobanoglu, Muhittin Cavusoglu, Gozde Turktarhan\",\"doi\":\"10.5038/2640-6489.6.1.1177\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Introduction\\n \\n Researchers around the globe are utilizing crowdsourcing tools to reach respondents for quantitative and qualitative research (Chambers & Nimon, 2019). Many social science and business journals are receiving studies that utilize crowdsourcing tools such as Amazon Mechanical Turk (MTurk), Qualtrics, MicroWorkers, ShortTask, ClickWorker, and Crowdsource (e.g., Ahn, & Back, 2019; Ali et al., 2021; Esfahani, & Ozturk, 2019; Jeong, & Lee, 2017; Zhang et al., 2017). Even though the use of these tools presents a great opportunity for sharing large quantities of data quickly, some challenges must also be addressed. The purpose of this guide is to present the basic ideas behind the use of crowdsourcing for survey research and provide a primer for best practices that will increase their validity and reliability.\\n \\n What is crowdsourcing research?\\n \\n Crowdsourcing describes the collection of information, opinions, or other types of input from a large number of people, typically via the internet, and which may or may not receive (financial) compensation (Hargrave, 2019; Oxford Dictionary, n.d.). Within the behavioral science realm, crowdsourcing is defined as the use of internet services for hosting research activities and for creating opportunities for a large population of participants. Applications of crowdsourcing techniques have evolved over the decades, establishing the strong informational power of crowds. The advent of Web 2.0 has expanded the possibilities of crowdsourcing, with new online tools such as online reviews, forums, Wikipedia, Qualtrics, or MTurk, but also other platforms such as Crowdflower and Prolific Academic (Peer et al., 2017; Sheehan, 2018).\\n Crowdsourcing platforms in the age of Web 2.0 use remote labor recruited via the internet to assist employers complete tasks that cannot be left to machines. Key characteristics of crowdsourcing include payment for workers, their recruitment from any location, and the completion of tasks (Behrend et al., 2011). They also allow for a relatively quick collection of data compared to data collection in the field, and participants are rewarded with an incentive—often financial compensation. Crowdsourcing not only offers a large participation pool but also a streamlined process for the study design, participant recruitment, and data collection as well as integrated participant compensation system (Buhrmester et al., 2011). Also, compared to other traditional marketing firms, crowdsourcing makes it easier to detect possible sampling biases (Garrow et al., 2020). 
Due to advantages such as reduced costs, diversity of participants, and flexibility, crowdsourcing platforms have surged in popularity for researchers.\\n \\n Advantages\\n \\n MTurk is one of the most popular crowdsourcing platforms among researchers, allowing Requesters to submit tasks for Workers to complete (Cummings & Sibona, 2017). MTurk has been used as an online crowdsourcing platform for the recruitment of human subjects for research purposes (Paolacci & Chandler, 2014). Research has also shown MTurk to be a reliable and cost-effective tool, capable of providing representative data for research in the behavioral sciences (e.g., Crump et al., 2013; Goodman et al., 2013; Mason & Suri, 2012; Rand, 2012; Simcox & Fiez, 2014). In addition to its use in social science studies, the platform has been used in marketing, hospitality and tourism, psychology, political science, communication, and sociology contexts (Sheehan, 2018). To illustrate, between 2012 and 2017, more than 40% of the studies published in the Journal of Consumer Research used crowdsourcing websites for their data collection (Goodman & Paolacci, 2017).\\n \\n Disadvantages\\n \\n Although researchers have assessed crowdsourcing platforms as reliable and cost-effective for data collection in the behavioral sciences, they are not exempt of flaws. One disadvantage is the possibility of unsatisfactory data quality. In fact, the virtual setting of the survey implies that the investigator is physically separated from the participant, and this lack of monitoring could lead to data quality issues (Sheehan, 2018). In addition, participants in survey research on crowdsourcing platforms are not always who they claim to be, creating issues of trust with the data provided and, ultimately, the quality of the research findings (McGonagle, 2015; Smith et al., 2016).\\n A recurrent concern with MTurk workers, for instance, is their assessment as experienced survey takers (Chandler et al., 2015). This experience is mainly acquired through completion of dozens of surveys per day, especially when they are faced with similar items and scales. Smith et al. (2016) identified two types of problems performing data collection using MTurk; namely, cheaters and speeders. As compared to Qualtrics—which has a strict screening and quality-control processes to ensure that participants are who they claim to be—MTurk appears to be less exigent regarding the workers. However, a downside for data collection with Qualtrics is more expensive fees—about $5.00 per questionnaire on Qualtrics, against $0.50 to $1.50 on MTurk (Ford, 2017). Hence, few researchers were able to conduct surveys and compare respondent pools with Qualtrics or other traditional marketing research firms (Garrow et al., 2020).\\n Another challenge using MTurk arises when trying to collect a desired number of responses from a population targeted to a specific city or area (Ross et al., 2010). The issues inherent to the selection process of MTurk have been the subject of investigations in several studies (e.g., Berinsky et al., 2012; Chandler et al., 2014; 2015; Harms & DeSimone, 2015; Paolacci et al., 2010; Rand, 2012). Feitosa et al. (2015) pointed out that international respondents may still identify themselves as U.S. respondents with the use of fake addresses and accounts. They found that 5% to 10% of participants identifying themselves as U.S. respondents were actually from overseas locations. Moreover, Babin et al. 
(2016) assessed that the use of trap questions allowed researchers to uncover that many respondents change their genders, ages, careers, or income within the course of a single survey. The issues of (a) experienced workers for the quality control of questions and (b) speeders, which, for MTurk can be attributed to the platform being the main source of revenue for a given respondent, remain the inherent issues of crowdsourcing platforms used for research purposes.\\n \\n Best practices\\n \\n Some best practices can be recommended in the use of crowdsourcing platforms for data collection purposes. Workers IDs can be matched with IDs from previous studies, thus allowing researchers to exclude responses from workers who had answered previous similar studies (Goodman & Paolacci, 2017). Furthermore, proceed to a manual assignment of qualification on MTurk prior to data collection (Litman et al., 2015; Park & Park, 2020). When dealing with experienced workers, both using multiple attention checks and optimizing the survey in a way to have the participants exposed to the stimuli for a sufficient length of time to better address the questions are also recommended (Sheehan, 2018). In this sense, shorter surveys are preferred to longer ones, which affect the participant’s concentration, and may, in turn, adversely impact the quality of their answers. Most importantly, pretest the survey to make sure that all parts are working as expected.\\n Researchers should also keep in mind that in the context of MTurk, the primary method for measurement is the web interface. Thus, to avoid method biases, researchers should ponder whether or not method factors emerge in the latent measurement models (Podsakoff et al., 2012). As such, time-lagged research designs may be preferred as predictor and criterion variables can be measured at different points in time or administered in different platforms, such as Qualtrics vs MTurk (Cheung et al., 2017). In general, the use of crowdsourcing platforms including MTurk may be appropriate according to the research question; and the quality of data is reliant on the quality-control strategies used by researchers to enhance data quality. Trade-offs between various validity types need to be prioritized according to the research objectives (Cheung et al., 2017).\\n From our experience using crowdsourcing tools for our own research as the editorial team members of several journals and chair of several conferences, we provide the best practices as outlined below:\\n \\n MTurk Worker (Respondent) Selection:\\n \\n \\n Researchers should consider their study population before using MTurk for data collection. The MTurk platform should be used for the appropriate study population. For example, if the study targets restaurant owners or company CEOs, MTurk workers may not be suitable for the study. However, if the target population is diners, hotel guests, grocery shoppers, online shoppers, students, or hourly employees, utilizing a sample from MTurk would be suitable.\\n \\n Researchers should use the selection tool in the software. For example, if you target workers only from one country, exclude responses that came from an internet protocol (IP) address outside the targeted country and report the results in the method section.\\n Researchers should consider the demographics of workers on MTurk which must reflect the study targeted population. For example, if the study focuses on baby boomers use of technology, then the MTurk sample should include only baby boomers. 
Similarly, the gender balance, racial composition, and income of people on MTurk should mirror the targeted population.\\n Researchers should use multiple screening tools that identify quality respondents and avoid problematic response patterns. For example, MTurk provides the approval rate for the respondents. This refers to how many times a respondent is rejected for various reasons (i.e., wrong code entered). We recommend using a 90% or higher approv\",\"PeriodicalId\":448415,\"journal\":{\"name\":\"Journal of Global Business Insights\",\"volume\":\"78 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"33\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Global Business Insights\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5038/2640-6489.6.1.1177\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Global Business Insights","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5038/2640-6489.6.1.1177","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Introduction

Researchers around the globe are utilizing crowdsourcing tools to reach respondents for quantitative and qualitative research (Chambers & Nimon, 2019). Many social science and business journals receive studies that utilize crowdsourcing tools such as Amazon Mechanical Turk (MTurk), Qualtrics, MicroWorkers, ShortTask, ClickWorker, and Crowdsource (e.g., Ahn & Back, 2019; Ali et al., 2021; Esfahani & Ozturk, 2019; Jeong & Lee, 2017; Zhang et al., 2017). Even though these tools present a great opportunity for collecting large quantities of data quickly, some challenges must also be addressed. The purpose of this guide is to present the basic ideas behind the use of crowdsourcing for survey research and to provide a primer on best practices that will increase the validity and reliability of such studies.

What is crowdsourcing research?

Crowdsourcing describes the collection of information, opinions, or other types of input from a large number of people, typically via the internet, who may or may not receive (financial) compensation (Hargrave, 2019; Oxford Dictionary, n.d.). Within the behavioral sciences, crowdsourcing is defined as the use of internet services to host research activities and to create opportunities for a large population of participants. Applications of crowdsourcing techniques have evolved over the decades, establishing the strong informational power of crowds. The advent of Web 2.0 has expanded the possibilities of crowdsourcing through new online tools such as online reviews, forums, Wikipedia, Qualtrics, and MTurk, as well as other platforms such as Crowdflower and Prolific Academic (Peer et al., 2017; Sheehan, 2018).

Crowdsourcing platforms in the age of Web 2.0 use remote labor recruited via the internet to help employers complete tasks that cannot be left to machines. Key characteristics of crowdsourcing include payment for workers, recruitment from any location, and the completion of tasks (Behrend et al., 2011). These platforms also allow for relatively quick data collection compared with fieldwork, and participants are rewarded with an incentive, often financial compensation. Crowdsourcing not only offers a large participation pool but also a streamlined process for study design, participant recruitment, and data collection, as well as an integrated participant compensation system (Buhrmester et al., 2011). Also, compared with traditional marketing research firms, crowdsourcing makes it easier to detect possible sampling biases (Garrow et al., 2020). Due to advantages such as reduced costs, diversity of participants, and flexibility, crowdsourcing platforms have surged in popularity among researchers.

Advantages

MTurk is one of the most popular crowdsourcing platforms among researchers, allowing Requesters to submit tasks for Workers to complete (Cummings & Sibona, 2017). MTurk has been used as an online crowdsourcing platform for the recruitment of human subjects for research purposes (Paolacci & Chandler, 2014). Research has also shown MTurk to be a reliable and cost-effective tool, capable of providing representative data for research in the behavioral sciences (e.g., Crump et al., 2013; Goodman et al., 2013; Mason & Suri, 2012; Rand, 2012; Simcox & Fiez, 2014). In addition to its use in social science studies, the platform has been used in marketing, hospitality and tourism, psychology, political science, communication, and sociology contexts (Sheehan, 2018).
To illustrate, between 2012 and 2017, more than 40% of the studies published in the Journal of Consumer Research used crowdsourcing websites for their data collection (Goodman & Paolacci, 2017).

Disadvantages

Although researchers have assessed crowdsourcing platforms as reliable and cost-effective for data collection in the behavioral sciences, they are not exempt from flaws. One disadvantage is the possibility of unsatisfactory data quality. The virtual setting of the survey means that the investigator is physically separated from the participant, and this lack of monitoring can lead to data quality issues (Sheehan, 2018). In addition, participants in survey research on crowdsourcing platforms are not always who they claim to be, creating issues of trust in the data provided and, ultimately, in the quality of the research findings (McGonagle, 2015; Smith et al., 2016).

A recurrent concern with MTurk workers, for instance, is that they tend to be experienced survey takers (Chandler et al., 2015). This experience is mainly acquired by completing dozens of surveys per day, especially when workers face similar items and scales. Smith et al. (2016) identified two types of problematic respondents when collecting data with MTurk: cheaters and speeders. Compared with Qualtrics, which has strict screening and quality-control processes to ensure that participants are who they claim to be, MTurk appears to be less stringent regarding its workers. However, a downside of data collection with Qualtrics is higher fees: about $5.00 per questionnaire on Qualtrics, against $0.50 to $1.50 on MTurk (Ford, 2017). Hence, few researchers have been able to conduct surveys and compare respondent pools with Qualtrics or other traditional marketing research firms (Garrow et al., 2020).

Another challenge with MTurk arises when trying to collect a desired number of responses from a population targeted to a specific city or area (Ross et al., 2010). The issues inherent in MTurk's selection process have been investigated in several studies (e.g., Berinsky et al., 2012; Chandler et al., 2014, 2015; Harms & DeSimone, 2015; Paolacci et al., 2010; Rand, 2012). Feitosa et al. (2015) pointed out that international respondents may still identify themselves as U.S. respondents by using fake addresses and accounts; they found that 5% to 10% of participants identifying themselves as U.S. respondents were actually located overseas. Moreover, Babin et al. (2016) found that the use of trap questions allowed researchers to uncover that many respondents change their genders, ages, careers, or incomes within the course of a single survey. The issues of (a) experienced workers, which affect the quality control of questions, and (b) speeders, which for MTurk can be attributed to the platform being the main source of income for a given respondent, remain inherent issues of crowdsourcing platforms used for research purposes.

Best practices

Some best practices can be recommended when using crowdsourcing platforms for data collection. Worker IDs can be matched with IDs from previous studies, allowing researchers to exclude responses from workers who answered previous similar studies (Goodman & Paolacci, 2017). Furthermore, researchers can manually assign qualifications on MTurk prior to data collection (Litman et al., 2015; Park & Park, 2020).
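The Worker ID matching and qualification assignment described above can be automated through the MTurk Requester API. The sketch below is a minimal illustration, assuming Python with boto3's MTurk client: it creates a custom qualification, assigns it to every WorkerId found in earlier batch files, and then publishes a HIT that is hidden from anyone holding that qualification. The batch file names, survey URL, and reward are placeholders, and the snippet is an illustrative pattern rather than the authors' own procedure.

```python
"""Exclude workers who completed earlier studies by tagging them with a
custom MTurk qualification and requiring its absence on the new HIT.
Sketch only: file names, URLs, and reward values are placeholders."""

import csv
import boto3

# Use the sandbox endpoint while testing; remove endpoint_url to go live.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# 1. Create a qualification marking "took part in a previous wave".
qual = mturk.create_qualification_type(
    Name="Participated in earlier survey waves",
    Description="Workers who completed one of our previous questionnaires.",
    QualificationTypeStatus="Active",
)
qual_id = qual["QualificationType"]["QualificationTypeId"]

# 2. Assign it to every WorkerId found in the old batch result files
#    (the batch CSVs downloaded from MTurk contain a WorkerId column).
previous_ids = set()
for path in ["batch_wave1.csv", "batch_wave2.csv"]:  # placeholder file names
    with open(path, newline="") as f:
        previous_ids.update(row["WorkerId"] for row in csv.DictReader(f))

for worker_id in previous_ids:
    mturk.associate_qualification_with_worker(
        QualificationTypeId=qual_id,
        WorkerId=worker_id,
        IntegerValue=1,
        SendNotification=False,
    )

# 3. Publish the new HIT, hidden from anyone who holds that qualification.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

hit = mturk.create_hit(
    Title="10-minute consumer survey",
    Description="Answer a short questionnaire about your dining habits.",
    Keywords="survey, questionnaire, research",
    Reward="1.00",
    MaxAssignments=200,
    AssignmentDurationInSeconds=30 * 60,
    LifetimeInSeconds=3 * 24 * 60 * 60,
    Question=question_xml,
    QualificationRequirements=[{
        "QualificationTypeId": qual_id,
        "Comparator": "DoesNotExist",
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    }],
)
print("New HIT:", hit["HIT"]["HITId"])
```

Requiring the qualification's absence (Comparator "DoesNotExist") keeps repeat participants from even previewing the task, which is generally preferable to discarding their responses after the data are collected.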
When dealing with experienced workers, researchers are also advised to use multiple attention checks and to optimize the survey so that participants are exposed to the stimuli long enough to properly address the questions (Sheehan, 2018). In this sense, shorter surveys are preferred to longer ones, which strain participants' concentration and may, in turn, adversely affect the quality of their answers. Most importantly, pretest the survey to make sure that all parts work as expected.

Researchers should also keep in mind that in the context of MTurk, the primary method of measurement is the web interface. Thus, to avoid method biases, researchers should consider whether method factors emerge in the latent measurement models (Podsakoff et al., 2012). As such, time-lagged research designs may be preferred, since predictor and criterion variables can be measured at different points in time or administered on different platforms, such as Qualtrics versus MTurk (Cheung et al., 2017). In general, the use of crowdsourcing platforms, including MTurk, may be appropriate depending on the research question, and the quality of the data relies on the quality-control strategies researchers use to enhance it. Trade-offs between various validity types need to be prioritized according to the research objectives (Cheung et al., 2017).

From our experience using crowdsourcing tools in our own research, as editorial team members of several journals and chairs of several conferences, we offer the best practices outlined below.

MTurk Worker (Respondent) Selection:

- Researchers should consider their study population before using MTurk for data collection; the platform should only be used when it fits that population. For example, if the study targets restaurant owners or company CEOs, MTurk workers may not be suitable. However, if the target population is diners, hotel guests, grocery shoppers, online shoppers, students, or hourly employees, a sample from MTurk would be suitable.

- Researchers should use the selection tools in the software. For example, if you target workers from only one country, exclude responses that come from an internet protocol (IP) address outside the targeted country and report the results in the method section.

- Researchers should consider whether the demographics of workers on MTurk reflect the study's target population. For example, if the study focuses on baby boomers' use of technology, then the MTurk sample should include only baby boomers. Similarly, the gender balance, racial composition, and income of people on MTurk should mirror the target population.

- Researchers should use multiple screening tools that identify quality respondents and avoid problematic response patterns. For example, MTurk provides an approval rate for each worker, which reflects how often the worker's previous submissions have been rejected for various reasons (e.g., entering a wrong completion code). We recommend requiring a 90% or higher approval rate (see the sketch following this list).
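Two of the screening recommendations above, restricting participation to one country and requiring a 90% or higher approval rate, can be enforced at the moment the HIT is created rather than corrected after the data arrive. The sketch below again assumes Python with boto3's MTurk client; the long IDs are MTurk's documented system qualification types (worker locale, percent of assignments approved, and number of HITs approved), and the survey URL, reward, and thresholds are placeholders. The minimum-approved-HITs requirement is not part of the authors' list; it is a common companion to the approval-rate threshold because that statistic is less informative for very new workers. Note also that the locale requirement screens on the worker's registered country, which complements rather than replaces the post hoc IP address check recommended above.

```python
"""Screen workers at HIT-creation time: require a U.S. locale and a 90%+
approval rate using MTurk's built-in system qualifications.
Sketch only: the survey URL, reward, and thresholds are placeholders."""

import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",  # testing endpoint
)

# System qualification type IDs documented by MTurk.
LOCALE_QUAL = "00000000000000000071"         # Worker_Locale
APPROVAL_RATE_QUAL = "000000000000000000L0"  # Worker_PercentAssignmentsApproved
HITS_APPROVED_QUAL = "00000000000000000040"  # Worker_NumberHITsApproved

question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

hit = mturk.create_hit(
    Title="10-minute consumer survey (U.S. residents)",
    Description="Answer a short questionnaire about your shopping habits.",
    Keywords="survey, questionnaire, research",
    Reward="1.00",
    MaxAssignments=200,
    AssignmentDurationInSeconds=30 * 60,
    LifetimeInSeconds=3 * 24 * 60 * 60,
    Question=question_xml,
    QualificationRequirements=[
        {   # Only workers whose registered locale is the United States.
            "QualificationTypeId": LOCALE_QUAL,
            "Comparator": "EqualTo",
            "LocaleValues": [{"Country": "US"}],
            "ActionsGuarded": "DiscoverPreviewAndAccept",
        },
        {   # Only workers with a lifetime approval rate of 90% or more.
            "QualificationTypeId": APPROVAL_RATE_QUAL,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [90],
            "ActionsGuarded": "DiscoverPreviewAndAccept",
        },
        {   # Optional: require a track record of at least 100 approved HITs.
            "QualificationTypeId": HITS_APPROVED_QUAL,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [100],
            "ActionsGuarded": "DiscoverPreviewAndAccept",
        },
    ],
)
print("New HIT:", hit["HIT"]["HITId"])
```

Enforcing these requirements at creation time prevents ineligible workers from ever accepting the task; pairing them with the post hoc IP and attention-check screening described above is the more conservative choice.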