Outliers in L2 Research in Applied Linguistics: A Synthesis and Data Re-Analysis

IF 4.3, Zone 1 (Literature), LANGUAGE & LINGUISTICS

Annual Review of Applied Linguistics Pub Date : 2020-03-01 DOI:10.1017/S0267190520000057

Christopher Nicklin, Luke Plonsky


Citations: 33

Abstract

Data from self-paced reading (SPR) tasks are routinely checked for statistical outliers (Marsden, Thompson, & Plonsky, 2018). Such data points can be handled in a variety of ways (e.g., trimming, data transformation), each of which may influence study results in a different manner. This two-phase study sought, first, to systematically review outlier handling techniques found in studies that involve SPR and, second, to re-analyze raw data from SPR tasks to understand the impact of those techniques. Toward these ends, in Phase I, a sample of 104 studies that employed SPR tasks was collected and coded for different outlier treatments. As found in Marsden et al. (2018), wide variability was observed across the sample in terms of selection of time and standard deviation (SD)-based boundaries for determining what constitutes a legitimate reading time (RT). In Phase II, the raw data from the SPR studies in Phase I were requested from the authors. Nineteen usable datasets were obtained and re-analyzed using data transformations, SD boundaries, trimming, and winsorizing, in order to test their relative effectiveness for normalizing SPR reaction time data. The results suggested that, in the vast majority of cases, logarithmic transformation circumvented the need for SD boundaries, which blindly eliminate or alter potentially legitimate data. The results also indicated that choice of SD boundary had little influence on the data and revealed no meaningful difference between trimming and winsorizing, implying that blindly removing data from SPR analyses might be unnecessary. Suggestions are provided for future research involving SPR data and the handling of outliers in second language (L2) research more generally.
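For readers unfamiliar with the treatments named in the abstract, the following is a minimal Python sketch of how SD-boundary trimming, winsorizing, and logarithmic transformation differ in what they do to a set of reading times. It is an illustration only, not the authors' procedure: the function names, the 2.5 SD cutoff, and the reading times are all assumptions invented for this example.

```python
import math
import statistics


def sd_bounds(values, k=2.5):
    """Return (lower, upper) cutoffs at mean +/- k sample SDs.

    k=2.5 is an arbitrary illustrative choice, not a recommended value.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def trim(values, k=2.5):
    """Trimming: drop observations that fall outside the SD boundary."""
    lower, upper = sd_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def winsorize(values, k=2.5):
    """Winsorizing: keep every observation, but pull values outside
    the SD boundary back to the boundary itself."""
    lower, upper = sd_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]


def log_transform(values):
    """Natural-log transform; keeps every observation and compresses
    the long right tail typical of timing data."""
    return [math.log(v) for v in values]


# Hypothetical reading times in milliseconds (invented for illustration).
rts = [310, 325, 298, 340, 355, 330, 315, 305, 345, 320, 2900]

trimmed = trim(rts)
winsorized = winsorize(rts)
logged = log_transform(rts)

print(len(rts), len(trimmed), len(winsorized), len(logged))
```

Trimming shortens the list, whereas winsorizing and log transformation preserve every observation; which treatment is appropriate for a given dataset is the question the study itself addresses.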


Source journal

Annual Review of Applied Linguistics

CiteScore: 6.70

Self-citation rate: 5.40%

About the journal: The Annual Review of Applied Linguistics publishes research on key topics in the broad field of applied linguistics. Each issue is thematic, providing a variety of perspectives on the topic through research summaries, critical overviews, position papers, and empirical studies. Being responsive to the field, some issues are tied to the theme of that year's annual conference of the American Association for Applied Linguistics. Also, at regular intervals an issue will take the approach of covering applied linguistics as a field more broadly, including coverage of critical or controversial topics. ARAL provides cutting-edge and timely articles on a wide range of areas, including language learning and pedagogy, second language acquisition, sociolinguistics, language policy and planning, language assessment, and research design and methodology, to name just a few.