Case-sensitive letter and bigram frequency counts from large-scale English corpora.
Michael N Jones, D J K Mewhort
Behavior research methods, instruments, & computers: a journal of the Psychonomic Society, Inc, volume 36, issue 3, pages 388-96
Published: 2004-08-01
DOI: https://doi.org/10.3758/bf03195586
Citations: 114
Abstract
We tabulated upper- and lowercase letter frequency using several large-scale English corpora (approximately 183 million words in total). The results indicate that the relative frequencies for upper- and lowercase letters are not equivalent. We report a letter-naming experiment in which uppercase frequency predicted response time to uppercase letters better than did lowercase frequency. Tables of case-sensitive letter and bigram frequency are provided, including common nonalphabetic characters. Because subjects are sensitive to frequency relationships among letters, we recommend that experimenters use case-sensitive counts when constructing stimuli from letters.
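As a rough illustration of what a case-sensitive tabulation involves, the sketch below counts single characters and within-word bigrams without folding case, then normalizes uppercase and lowercase letters separately. It is not the authors' procedure, which the abstract does not describe. The function names, the whitespace tokenization, the choice to keep bigrams inside word boundaries, and the sample sentence are all assumptions made for the example.

```python
from collections import Counter


def count_letters_and_bigrams(text):
    """Return case-sensitive character and bigram counts for `text`.

    Assumption: tokens are split on whitespace and bigrams do not span
    token boundaries. Punctuation attached to a token is counted too.
    """
    letters = Counter()
    bigrams = Counter()
    for token in text.split():
        letters.update(token)  # one count per character, case preserved
        bigrams.update(a + b for a, b in zip(token, token[1:]))
    return letters, bigrams


def relative_frequencies(counts, keep):
    """Normalize the counts of characters accepted by `keep` to sum to 1."""
    kept = {ch: n for ch, n in counts.items() if keep(ch)}
    total = sum(kept.values())
    return {ch: n / total for ch, n in kept.items()} if total else {}


# Hypothetical sample text, not data from the corpora in the paper.
sample = "The cat sat. The dog ran."
letters, bigrams = count_letters_and_bigrams(sample)

# Separate distributions for each case, so "T" and "t" are never pooled.
upper = relative_frequencies(letters, str.isupper)
lower = relative_frequencies(letters, str.islower)

print(letters["T"], letters["t"])    # uppercase and lowercase counted apart
print(bigrams["Th"], bigrams["th"])  # bigrams are case-sensitive as well
```

Keeping the two distributions apart is what lets an experimenter look up the frequency of an uppercase stimulus in the uppercase table rather than in a pooled, case-folded one.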