Language Testing | Pub Date: 2023-03-13 | DOI: 10.1177/02655322231151384
P. Tavakoli, Gill Kendon, Svetlana Mazhurnaya, A. Ziomek
"Assessment of fluency in the Test of English for Educational Purposes"
Abstract: The main aim of this study was to investigate how oral fluency is assessed across different levels of proficiency in the Test of English for Educational Purposes (TEEP). Working with data from 56 test-takers performing a monologic task at a range of proficiency levels (equivalent to approximately levels 5.0, 5.5, 6.5, and 7.5 in the IELTS scoring system), we used PRAAT analysis to measure speed, breakdown, and repair fluency. A multivariate analysis of variance and a series of analyses of variance were used to examine the differences between fluency measures at these different levels of proficiency. The results largely replicate previous research in this area suggesting that (a) speed measures distinguish between lower levels (5.0 and 5.5) and higher levels of proficiency (6.5 and 7.5), (b) breakdown measures of silent pauses distinguish between 5.0 and higher levels of 6.5 or 7.5, and (c) repair measures and filled pauses do not distinguish between any of the proficiency levels. Using the results, we have proposed changes that can help refine the fluency rating descriptors and rater training materials in the TEEP.
Vol. 40(1), pp. 607–629.
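The analysis described above (a MANOVA followed by a series of univariate ANOVAs comparing fluency measures across four proficiency bands) can be sketched as follows. This is a minimal illustration of one such follow-up ANOVA, not the study's analysis: the data are simulated and the measure (articulation rate) and group sizes are assumptions.

```python
# Hypothetical sketch of one follow-up ANOVA: compare a single fluency
# measure (here, simulated articulation rate in syllables/second) across
# four proficiency bands. All values are fabricated for illustration.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

bands = {
    "5.0": rng.normal(2.8, 0.4, 14),  # lower bands: slower speech
    "5.5": rng.normal(3.0, 0.4, 14),
    "6.5": rng.normal(3.6, 0.4, 14),  # higher bands: faster speech
    "7.5": rng.normal(3.8, 0.4, 14),
}

# One-way ANOVA across the four bands
f_stat, p_value = f_oneway(*bands.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```

In the study's actual design, one such test would be run per fluency measure (speed, breakdown, repair) after the omnibus MANOVA, typically with a correction for multiple comparisons.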
Language Testing | Pub Date: 2023-03-13 | DOI: 10.1177/02655322231156105
Yongzhi Miao
"The relationship among accent familiarity, shared L1, and comprehensibility: A path analysis perspective"
Abstract: Scholars have argued for the inclusion of different spoken varieties of English in high-stakes listening tests to better represent the global use of English. However, doing so may introduce additional construct-irrelevant variance due to accent familiarity and the shared first language (L1) advantage, which could threaten test fairness. Moreover, it is unclear to what extent accent familiarity and a shared L1 are related to or conflated with each other. The present study investigates the relationship between accent familiarity, a shared L1, and comprehensibility. Results from descriptive statistics and a Mann–Whitney U test based on 302 second language (L2) English listeners' responses to an online questionnaire suggested that a shared L1 implied high accent familiarity, but not vice versa. A path analysis revealed a complex relationship between accent familiarity, a shared L1, and comprehensibility. While a shared L1 had a direct effect on accent familiarity, and accent familiarity had a direct effect on comprehensibility, a shared L1 did not predict comprehensibility when accent familiarity was controlled for. These results disentangle accent familiarity from a shared L1. Researchers should consider both constructs when investigating fairness in relation to World Englishes for listening assessment.
Vol. 40(1), pp. 723–747.
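The mediation pattern the abstract reports (shared L1 predicts familiarity, familiarity predicts comprehensibility, and the direct L1-to-comprehensibility path vanishes once familiarity is controlled for) can be illustrated with a toy regression sketch. This is not the study's data or its path model; the effect sizes and the generative assumptions below are invented purely to show the logic.

```python
# Toy mediation sketch: shared_l1 -> familiarity -> comprehensibility.
# Simulated data only; coefficients are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
n = 302  # matches the reported sample size, but data are simulated

shared_l1 = rng.integers(0, 2, n).astype(float)          # 0/1 indicator
familiarity = 2.0 * shared_l1 + rng.normal(0, 1, n)      # L1 raises familiarity
comprehensibility = 1.5 * familiarity + rng.normal(0, 1, n)  # familiarity raises comprehensibility

def ols(y, predictors):
    """Least-squares slopes for y ~ predictors (intercept included, then dropped)."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

b_l1_only = ols(comprehensibility, [shared_l1])[0]
b_l1_ctrl, b_fam = ols(comprehensibility, [shared_l1, familiarity])

print(f"L1 alone: {b_l1_only:.2f}; "
      f"L1 controlling for familiarity: {b_l1_ctrl:.2f}; "
      f"familiarity: {b_fam:.2f}")
```

Because the simulated effect of a shared L1 runs entirely through familiarity, its coefficient shrinks toward zero once familiarity enters the model, which is the signature of full mediation that a path analysis formalizes.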
Language Testing | Pub Date: 2023-03-07 | DOI: 10.1177/02655322231153543
Nathan Vandeweerd, Alex Housen, M. Paquot
"Proficiency at the lexis–grammar interface: Comparing oral versus written French exam tasks"
Abstract: This study investigates whether re-thinking the separation of lexis and grammar in language testing could lead to more valid inferences about proficiency across modes. As argued by Römer, typical scoring rubrics ignore important information about proficiency encoded at the lexis–grammar interface, in particular how the co-selection of lexical and grammatical features is mediated by communicative function. This is especially evident when assessing oral versus written exam tasks, where the modality of a task may intersect with register-induced variation in linguistic output. This article presents the results of an empirical study in which we measured the diversity and sophistication of four-word lexical bundles extracted from a corpus of French proficiency exams. Analysis revealed that the diversity of noun-based bundles was a significant predictor of written proficiency scores and the sophistication of verb-based bundles was a significant predictor of proficiency scores across both modes, suggesting that communicative function as well as the constraints of online planning mediated the effect of lexicogrammatical phenomena on proficiency scores. Importantly, lexicogrammatical measures were better predictors of proficiency than solely lexical-based measures, which speaks to the potential utility of considering lexicogrammatical competence on scoring rubrics.
Vol. 40(1), pp. 658–683.
Language Testing | Pub Date: 2023-03-07 | DOI: 10.1177/02655322231152620
Nazlinur Gokturk, E. Chukharev-Hudilainen
"Strategy use in a spoken dialog system–delivered paired discussion task: A stimulated recall study"
Abstract: With recent technological advances, researchers have begun to explore the potential use of spoken dialog systems (SDSs) for L2 oral communication assessment. While several studies support the feasibility of building these systems for various types of oral tasks, research on the construct validity of SDS-delivered tasks is still limited. Thus, this study examines the cognitive processes engaged by an SDS-delivered paired discussion task, focusing on strategic competence, an essential component of L2 oral communication ability. Thirty adult test-takers completed a paired discussion task with an SDS acting as an interlocutor and provided stimulated recalls about their strategy use in the task. Three trained raters independently evaluated the test-takers' oral task responses using a holistic rating scale devised for the task. Findings revealed the use of six categories of construct-relevant strategies during task performance. While no statistically significant differences were found in the use of these categories between high- and low-ability test-takers, marked differences were observed in the use of individual strategies within the categories between the test-takers at the two levels. These findings provide insight into how test-takers at different ability levels cognitively interact with SDS-delivered paired discussion tasks and offer implications for the design and validation of such tasks.
Vol. 40(1), pp. 630–657.
Language Testing | Pub Date: 2023-03-02 | DOI: 10.1177/02655322221149642
K. Eberharter, Judit Kormos, Elisa Guggenbichler, Viktoria S. Ebner, Shungo Suzuki, Doris Moser-Frötscher, Eva Konrad, B. Kremmel
"Investigating the impact of self-pacing on the L2 listening performance of young learner candidates with differing L1 literacy skills"
Abstract: In online environments, listening involves being able to pause or replay the recording as needed. Previous research indicates that control over the listening input could improve the measurement accuracy of listening assessment. Self-pacing also supports the second language (L2) comprehension processes of test-takers with specific learning difficulties (SpLDs) or, more specifically, of learners with reading-related learning difficulties who might have slower processing speed and limited working memory capacity. Our study examined how L1 literacy skills influence L2 listening performance in the standard single-listening and self-paced administration modes of the listening section of the Test of English as a Foreign Language (TOEFL) Junior Standard test. In a counterbalanced design, 139 Austrian learners of English completed 15 items in a standard single-listening condition and another 15 in a self-paced condition. L1 literacy skills were assessed via a standard reading, non-word reading, word-naming, and non-word repetition test. Generalized Linear Mixed-Effects Modelling revealed that self-pacing had no statistically significant effect on listening scores, nor did it boost the performance of test-takers with lower L1 literacy scores indicative of reading-related SpLDs. The results indicate that young test-takers might require training in self-pacing, or that self-paced conditions may need to be carefully implemented when they are offered to candidates with SpLDs.
Vol. 40(1), pp. 960–983.
Language Testing | Pub Date: 2023-03-01 | DOI: 10.1177/02655322231158554
B. Deygers
"Book Review: The sociology of assessment: Comparative and policy perspectives: The selected works of Patricia Broadfoot"
Abstract: Education is an emancipatory force in society, and centralized testing offers an objective way to select talented pupils, identify performant schools within an educational system, and compare educational systems on a global scale. Such is the traditional view of educational assessment. This view, however, is rooted in 19th-century positivistic thinking, is naïve in its belief in objective measurement, and is agnostic toward evidence to the contrary, so argues educational sociologist Patricia Broadfoot in her book The Sociology of Assessment. A collection of essays, chapters, and articles that span her career, this book is a testament to her interest in sociology and in comparative education. The volume has two central themes that weave its four sections together. First, it is a defense of comparative policy analysis. Broadfoot contends that education is a cultural project first and foremost and shows herself to be a fierce opponent of educational policies that serve a neoliberal agenda. Because education is embedded in a specific culture, comparative analysis helps to identify which aspects of an educational policy are context-specific and which are relatively constant across contexts. In other words, identifying idiosyncratic educational policies provokes questions about practices that may seem self-evident for people within a certain educational culture, but are not universal. One aspect of education that comparative analysis shows to be rather constant across systems is the central role of standardized testing as a driver of education. Exploring and critiquing the use of such tests as a driver of educational policy is the second central theme and the backbone of the book.
The first section is the most conceptual and philosophical one. It establishes the core concepts of Broadfoot's thinking and outlines what she sees as the primary functions of assessment: attesting competence, regulating competition and selection, determining and shaping educational content, and controlling educational quality. She also explains how these functions are linked to Durkheim's work and zooms in on Weber, Bernstein, Bourdieu, Gramsci, and Foucault to lay the foundation of an argument that recurs throughout the book. This argument positions standardized educational testing as a […]
Vol. 40(1), pp. 840–843.
Language Testing | Pub Date: 2023-02-15 | DOI: 10.1177/02655322221143928
K. Khan
"Book Review: Looking Like a Language, Sounding Like a Race: Raciolinguistic Ideologies and the Learning of Latinidad"
Vol. 40(1), pp. 457–460.
Language Testing | Pub Date: 2023-02-02 | DOI: 10.1177/02655322221147924
Kátia Monteiro, S. Crossley, Robert-Mihai Botarleanu, M. Dascalu
"L2 and L1 semantic context indices as automated measures of lexical sophistication"
Abstract: Lexical frequency benchmarks have been extensively used to investigate second language (L2) lexical sophistication, especially in language assessment studies. However, indices based on semantic co-occurrence, which may be a better representation of the experience language users have with lexical items, have not been sufficiently tested as benchmarks of lexical sophistication. To address this gap, we developed and tested indices based on semantic co-occurrence from two computational methods, namely, Latent Semantic Analysis and Word2Vec. The indices were developed from one L2 written corpus (i.e., EF Cambridge Open Language Database [EF-CAMDAT]) and one first language (L1) written corpus (i.e., Corpus of Contemporary American English [COCA] Magazine). Available L1 semantic context indices (i.e., Touchstone Applied Sciences Associates [TASA] indices) were also assessed. To validate the indices, we used them to predict L2 essay quality scores as judged by human raters. The models suggested that the semantic context indices developed from EF-CAMDAT and TASA, but not the COCA Magazine indices, explained unique variance in the presence of lexical sophistication measures. This study suggests that semantic context indices based on multi-level corpora, including L2 corpora, may provide a useful representation of the experience L2 writers have with input, which may assist with automatic scoring of L2 writing.
Vol. 40(1), pp. 576–606.
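The core idea behind a semantic co-occurrence index can be shown with a small vector sketch: score a word by the mean cosine similarity between its embedding and the embeddings of the words it co-occurs with. The vectors below are random stand-ins, not LSA or Word2Vec embeddings trained on any of the corpora named above; the index definition is a simplified assumption for illustration.

```python
# Toy semantic context index: mean cosine similarity between a word's
# vector and the vectors of its co-occurring words. All vectors are
# randomly generated; real indices would use trained LSA/Word2Vec vectors.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_context_index(word_vec, context_vecs):
    """Mean cosine similarity between a word and its co-occurring words."""
    return float(np.mean([cosine(word_vec, c) for c in context_vecs]))

rng = np.random.default_rng(1)
base = rng.normal(size=50)  # stand-in embedding for the target word

# A word whose contexts cluster tightly around its own meaning ...
tight_contexts = [base + rng.normal(scale=0.3, size=50) for _ in range(20)]
# ... versus a word whose contexts are unrelated noise.
loose_contexts = [rng.normal(size=50) for _ in range(20)]

tight = semantic_context_index(base, tight_contexts)
loose = semantic_context_index(base, loose_contexts)
print(f"tight contexts: {tight:.2f}, loose contexts: {loose:.2f}")
```

Words that appear in semantically consistent contexts score high on such an index, while words scattered across unrelated contexts score near zero, which is one way co-occurrence experience can be quantified as a lexical sophistication benchmark.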
Language Testing | Pub Date: 2023-02-02 | DOI: 10.1177/02655322221149009
Ahyoung Alicia Kim, Meltem Yumsek, J. Kemp, Mark Chapman, H. Gary Cook
"Universal tools activation in English language proficiency assessments: A comparison of Grades 1–12 English learners with and without disabilities"
Abstract: English learners (ELs) comprise approximately 10% of kindergarten to Grade 12 students in US public schools, with about 15% of ELs identified as having disabilities. English language proficiency (ELP) assessments must adhere to universal design principles and incorporate universal tools, designed to increase accessibility for all ELs, including those with disabilities. This two-phase mixed methods study examined the extent to which Grades 1–12 ELs with and without disabilities activated universal tools during an online ELP assessment: Color Overlay, Color Contrast, Help Tools, Line Guide, Highlighter, Magnifier, and Sticky Notes. In Phase 1, analyses were conducted on 1.25 million students' test and telemetry data (records of keystrokes and clicks). Phase 2 involved interviewing 55 ELs after test administration. Findings show that ELs activated the Line Guide, Highlighter, and Magnifier more frequently than the other tools. The tool activation rate was higher in the listening and reading domains than in speaking and writing. A significantly higher percentage of ELs with disabilities activated the tools than ELs without disabilities, but effect sizes were small; interview findings further revealed students' rationales for tool use. Results indicate differences in ELs' activation of universal tools depending on their disability category and language domain, providing evidence for the usefulness of these tools.
Vol. 40(1), pp. 877–903.
Language Testing | Pub Date: 2023-01-12 | DOI: 10.1177/02655322221145643
Marcus Warnby, Hans Malmström, Kajsa Yang Hansen
"Linking scores from two written receptive English academic vocabulary tests—The VLT-Ac and the AVT"
Abstract: The academic section of the Vocabulary Levels Test (VLT-Ac) and the Academic Vocabulary Test (AVT) both assess meaning-recognition knowledge of written receptive academic vocabulary, deemed central for engagement in academic activities. Depending on the purpose and context of the testing, either of the tests can be appropriate, but for research and pedagogical purposes, it is important to be able to compare scores achieved on the two tests between administrations and within similar contexts. Based on a sample of 385 upper secondary school students in university-preparatory programs (independent CEFR B2-level users of English), this study presents a comparison model by linking the VLT-Ac and the AVT using concurrent calibration procedures in Item Response Theory. The key outcome of the study is a score comparison table providing a means for approximate score comparisons. Additionally, the study showcases a viable and valid method of comparing vocabulary scores from an older test with those from a newer one.
Vol. 40(1), pp. 548–575.
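Once two tests are calibrated onto a common IRT scale (which concurrent calibration does by estimating all item parameters jointly from a shared sample), a score comparison table can be read off the tests' characteristic curves: for each ability value, the expected raw score on each test. The sketch below illustrates only that final tabulation step under a Rasch model; the item difficulties and item counts are invented, not the VLT-Ac's or AVT's actual parameters.

```python
# Sketch of building a score comparison table from two tests calibrated
# on a shared Rasch scale. Difficulties and item counts are invented.
import numpy as np

def expected_score(theta, difficulties):
    """Expected raw score under the Rasch model: sum of P(correct | theta, b)."""
    return float(sum(1.0 / (1.0 + np.exp(-(theta - b))) for b in difficulties))

# Invented difficulty parameters on a common scale (concurrent calibration
# would estimate these jointly from real response data).
vlt_ac_items = np.linspace(-1.5, 1.5, 30)
avt_items = np.linspace(-1.0, 2.0, 60)

# Tabulate expected scores over an ability grid: rows of this table give
# the approximate score correspondence between the two tests.
print("theta  VLT-Ac   AVT")
for theta in np.arange(-2.0, 2.5, 0.5):
    print(f"{theta:5.1f}  {expected_score(theta, vlt_ac_items):6.1f}"
          f"  {expected_score(theta, avt_items):5.1f}")
```

Pairing the two expected-score columns at each ability value yields an approximate mapping between raw scores on the two tests, which is the shape the study's score comparison table takes.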