The Words that Count
Author: Richard M. Chapman
Journal: Journal of Research Design and Statistics in Linguistics and Communication Science, volume 8 1
Publication date: 2020-11-07
Publication type: Journal Article
DOI: 10.1558/jrds.40339 (https://doi.org/10.1558/jrds.40339)
Citation count: 0
Abstract
Computer assisted discourse studies (CADS) undoubtedly offers great prospects in our attempts to observe and understand the use and social effects of language in context, but it does so with a caveat: we need to be constantly aware of our significant (and often implicit) assumptions when attempting to reach beyond electronically analysed masses of text into assessments of pragmatic (and so social) behaviour. This paper aims to remind us of the need for reflection on our most basic assumptions as we begin to make use of more complex and refined procedures and start to make more ambitious claims about what various corpora can show us. It is argued that concepts such as the word, tokens, types and frequency require constant re-evaluation, in particular when we are using data that have been extracted from their original textual (and so contextualized) sources in the creation of corpora. It is hoped that a small contribution can be made to the debate about the empirical approach to understanding language, perhaps in terms of methodologies to be utilized, the potential extent and limits of CADS, or in terms of presenting or interpreting the results and conclusions of published studies.
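As a small illustration of why tokens, types and frequency depend on prior assumptions about what a word is, the sketch below counts the same sentence under two tokenization rules. The sample sentence, the function names and both rules are hypothetical choices made for this example; they are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical sample sentence, chosen only to expose tokenization choices.
TEXT = "Don't count on it: the word-count isn't the Word count."


def whitespace_tokens(text):
    """Assumption A: a word is any run of characters between spaces."""
    return text.split()


def regex_tokens(text):
    """Assumption B: a word is a lowercased run of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def summarize(tokens):
    """Return the number of tokens (running words) and types (distinct forms)."""
    freq = Counter(tokens)
    return {"tokens": len(tokens), "types": len(freq)}


if __name__ == "__main__":
    for name, tokenize in [("whitespace", whitespace_tokens),
                           ("letters only", regex_tokens)]:
        tokens = tokenize(TEXT)
        print(name, summarize(tokens), "freq('count') =", Counter(tokens)["count"])
```

Under the whitespace rule the sentence has 10 tokens and 9 types, and "count" occurs once; under the letters-only rule it has 13 tokens and 8 types, and "count" occurs three times. Neither figure is the correct one: each follows from a decision about case, punctuation, hyphens and apostrophes that was made before any counting began.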