From Coverage to Distribution: Exploring Lexical Features of the National Matriculation English Test
Tao Yang, Zhenhui Liang
Advances in Education, Humanities and Social Science Research, published 2024-01-30. DOI: 10.56028/aehssr.9.1.178.2024
Lexical features have long been a pivotal element of almost all high-stakes language tests. Since the implementation of the “Experimental Curriculum Criteria” in China, few studies have used corpus methodology to investigate the lexical features of the ensuing National Matriculation English Test, and none has examined word dispersion. To address this gap, the present study used Python programming to perform a corpus-based two-way coverage analysis and a visualized distribution analysis between the National Matriculation English Test and the Experimental Curriculum Criteria lexicon. It was found that: 1) text coverage of the National Matriculation English Test reached the minimal (95%) threshold but not the optimal (98%) threshold for adequate comprehension; 2) word-list coverage of the Experimental Curriculum Criteria was disproportionate and insufficient, with a large share (42.905%) of the prescribed lexicon never used during the 13 years of implementation; 3) a relatively small number (N = 90) of high-frequency words, most (74.444%) of which were significantly overused compared with their corresponding BNC frequency, constituted over half (51.403%) of the text coverage; and 4) the vast majority (93.333%) of high-frequency words were homogeneous in dispersion, confirming the overuse with fresh distribution evidence. The results are discussed in terms of implications for test development.
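The two directions of coverage named in the abstract, and one possible dispersion score, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the crude tokenizer, and the toy data are hypothetical, lemmatization is ignored, and the abstract does not say which dispersion measure was used, so Gries's DP (deviation of proportions) is assumed here purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens (crude, no lemmatization)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def text_coverage(tokens, word_list):
    """Share of running tokens in the test text that appear in the word list."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in word_list) / len(tokens)


def word_list_coverage(tokens, word_list):
    """Share of word-list entries that occur at least once in the test text."""
    if not word_list:
        return 0.0
    return len(word_list & set(tokens)) / len(word_list)


def dispersion_dp(word, parts):
    """Gries's DP (assumed measure): 0 means the word is spread in proportion
    to part sizes; values approaching 1 mean it is concentrated in few parts."""
    total_tokens = sum(len(p) for p in parts)
    counts = [p.count(word) for p in parts]
    total_hits = sum(counts)
    if total_tokens == 0 or total_hits == 0:
        return None  # undefined for an empty corpus or an absent word
    return 0.5 * sum(
        abs(c / total_hits - len(p) / total_tokens)
        for c, p in zip(counts, parts)
    )


# Toy data standing in for test papers and the prescribed lexicon (hypothetical).
papers = ["The cat sat on the mat.", "The dog sat on the log."]
lexicon = {"the", "cat", "sat", "on", "dog", "bird"}

parts = [tokenize(p) for p in papers]
all_tokens = [t for part in parts for t in part]

print(f"text coverage:      {text_coverage(all_tokens, lexicon):.3f}")
print(f"word-list coverage: {word_list_coverage(all_tokens, lexicon):.3f}")
print(f"DP for 'the':       {dispersion_dp('the', parts)}")
print(f"DP for 'cat':       {dispersion_dp('cat', parts)}")
```

In this sketch, text coverage answers "how much of the test is covered by the lexicon" (compared against the 95% and 98% thresholds), while word-list coverage answers "how much of the lexicon ever appears in the test". Treating each paper as one corpus part for the dispersion score is likewise an assumption.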