On over- and underuse in learner corpus research and multifactoriality in corpus linguistics more generally
S. Gries
Journal of Second Language Studies (Journal Article), published 2018-08-27
DOI: 10.1075/JSLS.00005.GRI (https://doi.org/10.1075/JSLS.00005.GRI)
Citations: 29
Abstract
This paper critically discusses how corpus linguistics in general, but learner corpus research in particular, has been dealing with
all sorts of frequency data in general, but over- and underuse frequencies in particular. I demonstrate on the basis of learner
corpus data the pitfalls of using aggregate data and lacking statistical control that much work is unfortunately characterized by.
In fact, I will demonstrate that monofactorial methods have very little to offer at all to research on observational data. While
this paper is admittedly very didactic and methodological, I think the discussion of the empirical data offered here – a
reanalysis of previously published work – shows how misleading many studies potentially are and provides far-reaching implications for
much of corpus linguistics and learner corpus research. Ideally/maximally, this paper together with Paquot & Plonsky (2017, International Journal
of Learner Corpus Research) would lead to a complete
revision of how learner corpus linguists use quantitative methods and study over-/underuse; minimally, this paper would stimulate
a much-needed discussion of currently lacking methodological sophistication.