Case-sensitive letter and bigram frequency counts from large-scale English corpora.
Michael N Jones, D J K Mewhort
Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc, 36(3), 388-96. Published 2004-08-01. Journal Article.
https://doi.org/10.3758/bf03195586
We tabulated upper- and lowercase letter frequency using several large-scale English corpora (approximately 183 million words in total). The results indicate that the relative frequencies for upper- and lowercase letters are not equivalent. We report a letter-naming experiment in which uppercase frequency predicted response time to uppercase letters better than did lowercase frequency. Tables of case-sensitive letter and bigram frequency are provided, including common nonalphabetic characters. Because subjects are sensitive to frequency relationships among letters, we recommend that experimenters use case-sensitive counts when constructing stimuli from letters.
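As a rough illustration of what a case-sensitive tabulation involves, the sketch below counts single characters and adjacent character pairs without folding case. It is not the authors' procedure: the abstract does not say how the corpora were tokenized, so skipping whitespace and refusing to count bigrams across whitespace are assumptions made here, and the names `case_sensitive_counts`, `relative_frequencies`, and `sample` are hypothetical.

```python
from collections import Counter


def case_sensitive_counts(text):
    """Count characters and adjacent character pairs, preserving case.

    Assumption: whitespace is not counted, and a bigram is counted only
    when both of its characters are non-whitespace, so no pair spans a
    word boundary. Nonalphabetic characters such as '.' are kept.
    """
    letters = Counter(ch for ch in text if not ch.isspace())
    bigrams = Counter(
        a + b
        for a, b in zip(text, text[1:])
        if not a.isspace() and not b.isspace()
    )
    return letters, bigrams


def relative_frequencies(counts):
    """Convert raw counts to proportions of the total count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


# Tiny made-up sample, used only to show that 'T' and 't' (and 'C' and 'c')
# are tallied separately. It is not data from the paper.
sample = "The cat sat. The Cat ran."
letters, bigrams = case_sensitive_counts(sample)
letter_freq = relative_frequencies(letters)
```

With this sample, `letters["T"]` is 2 while `letters["t"]` is 3, which shows on a small scale why uppercase and lowercase counts need not have the same relative frequencies.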