Language Testing | Pub Date: 2022-08-16 | DOI: 10.1177/02655322221112364
Samuel D. Ihlenfeldt, Joseph A. Rios
{"title":"A meta-analysis on the predictive validity of English language proficiency assessments for college admissions","authors":"Samuel D. Ihlenfeldt, Joseph A. Rios","doi":"10.1177/02655322221112364","DOIUrl":"https://doi.org/10.1177/02655322221112364","url":null,"abstract":"For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) give admissions decision-makers a sense of a student’s skills in academic English. Despite this explicit purpose, these exams have also been used to predict academic success. In this study, we meta-analytically synthesized 132 effect sizes from 32 studies containing validity evidence for academic English assessments to determine whether different assessments (a) predicted academic success (as measured by grade point average [GPA]) and (b) did so comparably. Overall, assessments had a weak positive correlation with academic achievement (r = .231, p < .001). Additionally, no significant differences were found in the predictive power of the IELTS and TOEFL exams. No moderators were significant, indicating that these findings held across school type, school level, and publication type. Although significant, the overall correlation was low; thus, practitioners are cautioned against using standardized English-language proficiency test scores in isolation, in lieu of a holistic application review, during the admissions process.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"276 - 299"},"PeriodicalIF":4.1,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46240900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
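The pooled estimate reported in the abstract above can be reproduced mechanically: each correlation is converted to Fisher's z, combined with inverse-variance weights, and back-transformed to r. A minimal fixed-effect sketch in Python; the (r, n) effect sizes below are hypothetical placeholders, not the study's data:

```python
import math

def pool_correlations(effects):
    """Fixed-effect pooling of correlations via Fisher's z transform.
    `effects` is a list of (r, n) pairs: correlation and sample size."""
    num = den = 0.0
    for r, n in effects:
        z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher r-to-z
        w = n - 3                              # inverse-variance weight, Var(z) = 1/(n - 3)
        num += w * z
        den += w
    z_bar = num / den
    # back-transform the pooled z to the r metric
    return (math.exp(2 * z_bar) - 1) / (math.exp(2 * z_bar) + 1)

# hypothetical effect sizes standing in for the synthesized studies
print(round(pool_correlations([(0.20, 150), (0.25, 300), (0.28, 90)]), 3))
```

A random-effects synthesis, as typically used in meta-analyses like this one, would additionally add a between-study variance component to each weight; the back-transformation step is the same.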
Language Testing | Pub Date: 2022-08-15 | DOI: 10.1177/02655322221114895
Beverly A. Baker
{"title":"Book Review: Multilingual Testing and Assessment","authors":"Beverly A. Baker","doi":"10.1177/02655322221114895","DOIUrl":"https://doi.org/10.1177/02655322221114895","url":null,"abstract":"From both a theoretical and an empirical perspective, this volume addresses the challenges of testing learners of multiple school languages. The author states that “This volume is intended as a non-technical resource to offer help and guidance to all those who work in education with multilingual populations” (p. 1). In that sense, it is not a book about the assessment of language per se (although she presents a research study in which she collects information on students’ language proficiency). Rather, it is intended primarily for non-language specialists: educators working with multilingual learners across all subjects. As she states throughout the work, the author addresses what she sees as limitations in theoretical and empirical work that considers only two languages, claiming that such work offers limited insights for those working with speakers of more than two languages. The author is motivated by the fair assessment of all students, including linguistically and culturally minoritized students. What follows is a summary and critical discussion of the book, beginning with an overview of each chapter and then offering critical commentary on a few chapters in particular (Chapters 2, 5, and 7). Given the repetition of ideas across the chapters, I assume that many of them have been designed to be read on a stand-alone basis. I have chosen to focus my comments on these chapters because, in my view, they form the core of the book—they contain the theoretical approach undergirding the author’s work, the practical guidance in the form of the author’s “integrated approach,” and the details of her empirical study.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"184 - 188"},"PeriodicalIF":4.1,"publicationDate":"2022-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45693580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language Testing | Pub Date: 2022-08-09 | DOI: 10.1177/02655322221113917
Shuai Li, Ting-hui Wen, Xian Li, Yali Feng, Chuan Lin
{"title":"Comparing holistic and analytic marking methods in assessing speech act production in L2 Chinese","authors":"Shuai Li, Ting-hui Wen, Xian Li, Yali Feng, Chuan Lin","doi":"10.1177/02655322221113917","DOIUrl":"https://doi.org/10.1177/02655322221113917","url":null,"abstract":"This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and on rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1) Chinese raters evaluated the examinees’ oral productions using two four-point rating scales. The holistic scale covered five dimensions simultaneously: communicative function, prosody, fluency, appropriateness, and grammaticality; the analytic scale included a sub-scale for each of the five dimensions. The raters scored the dataset twice, once with each marking method, in counterbalanced order. They also verbalized their scoring rationale after performing each rating. Results revealed that both marking methods led to high reliability and produced highly correlated scores; however, analytic marking showed better assessment quality in terms of higher reliability and measurement precision, higher percentages of Rasch model fit for examinees and items, and more balanced reference to rating criteria among raters during the scoring process.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"249 - 275"},"PeriodicalIF":4.1,"publicationDate":"2022-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41463902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language Testing | Pub Date: 2022-07-28 | DOI: 10.1177/02655322221113189
Atta Gebril
{"title":"Book Review: Challenges in Language Testing Around the World: Insights for Language Test Users","authors":"Atta Gebril","doi":"10.1177/02655322221113189","DOIUrl":"https://doi.org/10.1177/02655322221113189","url":null,"abstract":"With the increasing role of tests worldwide, language professionals and other stakeholders are regularly involved in a wide range of assessment-related decisions in their local contexts. Such decisions vary in the stakes associated with them, with many being high-stakes. Regardless of the stakes, assessment contexts tend to share something in common: the challenges that test users encounter on a daily basis. To make matters worse, many test users operate in instructional settings with little knowledge about assessment. Taylor (2009) refers to the lack of assessment literacy materials that are accessible to different stakeholders, arguing that such materials are “highly technical or too specialized for language educators seeking to understand basic principles and practice in assessment” (p. 23). On a related note, assessment literacy training tends to be offered in a one-size-fits-all manner that does not tap into the unique characteristics of local contexts. This view contradicts what researchers have reported in the literature, since assessment literacy is perceived as “a social and co-constructed construct,” “no longer viewed as passive accumulation of knowledge and skills” (Yan & Fan, 2021, p. 220), and tends to be affected by a number of contextual factors, such as linguistic background and teaching experience (Crusan et al., 2016). In light of these issues, the current volume taps into the existing challenges in different assessment/instructional settings. It is rare in our field to find a volume dedicated mainly to such challenges. There is a general sense that practitioners prefer not to adopt such a negative tone when reading or writing about language assessment practices. In addition, practitioners generally have neither the incentives and resources needed for publishing nor access to a suitable platform for sharing such experiences. Challenges in Language Testing Around the World: Insights for Language Test Users by Betty Lanteigne, Christine Coombe, and James Dean Brown is a good addition to the existing body of knowledge, since it offers a closer look at “things that could get overlooked, misapplied, misinterpreted, misused” in different assessment projects (Lanteigne et al., 2021, p. v). The authors are also to be commended for the international nature of the experiences reported in this volume.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"180 - 183"},"PeriodicalIF":4.1,"publicationDate":"2022-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44568134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language Testing | Pub Date: 2022-07-24 | DOI: 10.1177/02655322221100115
Ann-Kristin Helland Gujord
{"title":"Who succeeds and who fails? Exploring the role of background variables in explaining the outcomes of L2 language tests","authors":"Ann-Kristin Helland Gujord","doi":"10.1177/02655322221100115","DOIUrl":"https://doi.org/10.1177/02655322221100115","url":null,"abstract":"This study explores whether and to what extent the background information supplied by 10,155 immigrants who took an official language test in Norwegian affected their chances of passing one, two, or all three parts of the test. The background information included in the analysis was prior education, region (location of their home country), language (first language [L1] background, knowledge of English), second language (hours of second language [L2] instruction, L2 use), L1 community (years of residence, contact with L1 speakers), age, and gender. An ordered logistic regression analysis revealed that eight of the hypothesised explanatory variables significantly impacted the dependent variable (test result). Several of the significant variables relate to pre-immigration conditions, such as educational opportunities earlier in life. The findings have implications for language testing and also, to some extent, for the understanding of variation in learning outcomes.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"227 - 248"},"PeriodicalIF":4.1,"publicationDate":"2022-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45523649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
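An ordered logistic regression of the kind used in the study above models the probability of passing zero, one, two, or all three parts of the test through cumulative logits: P(Y <= k) = logistic(cutpoint_k - xb). A minimal sketch; the coefficients, predictor values, and cutpoints below are hypothetical, not the fitted model:

```python
import math

def ordered_logit_probs(eta, cutpoints):
    """Category probabilities under a proportional-odds (ordered logit) model.
    P(Y <= k) = logistic(cutpoint_k - eta); successive differences of the
    cumulative probabilities give the per-category probabilities."""
    logistic = lambda x: 1.0 / (1.0 + math.exp(-x))
    cum = [logistic(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# hypothetical coefficients for two background variables
beta = {"education_years": 0.15, "l2_instruction_hours": 0.002}
x = {"education_years": 12, "l2_instruction_hours": 400}
eta = sum(beta[k] * x[k] for k in beta)   # linear predictor: 1.8 + 0.8 = 2.6
cutpoints = [1.0, 2.5, 4.0]               # hypothetical thresholds between pass-0/1/2/3
print([round(p, 3) for p in ordered_logit_probs(eta, cutpoints)])
```

The four returned probabilities (pass none, one, two, or all three parts) always sum to 1; raising eta, e.g. through more education, shifts mass toward the higher pass categories, which is the pattern the study reports for pre-immigration variables.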
Language Testing | Pub Date: 2022-05-12 | DOI: 10.1177/02655322221092388
Stefanie A. Wind
{"title":"A sequential approach to detecting differential rater functioning in sparse rater-mediated assessment networks","authors":"Stefanie A. Wind","doi":"10.1177/02655322221092388","DOIUrl":"https://doi.org/10.1177/02655322221092388","url":null,"abstract":"Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting DRF may be limited in sparse rating designs, where it is not possible for every rater to score every student. In these designs, there is limited information with which to detect DRF. Sparse designs can also exacerbate the impact of artificial DRF, which occurs when raters are inaccurately flagged for DRF due to statistical artifacts. In this study, a sequential method is adapted from previous research on differential item functioning (DIF) that allows researchers to detect DRF more accurately and to distinguish between true and artificial DRF. Analyses of data from a rater-mediated writing assessment and a simulation study demonstrate that the sequential approach results in different conclusions about which raters exhibit DRF. Moreover, the simulation study results suggest that the sequential procedure improves the accuracy of DRF detection across a variety of rating design conditions. Practical implications for language testing research are discussed.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"40 1","pages":"209 - 226"},"PeriodicalIF":4.1,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43741835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
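The purification idea behind a sequential approach like the one described above can be illustrated with a toy severity index: raters flagged in a first pass are removed from the anchor before the reference gap is re-estimated, so a biased rater cannot distort the baseline against which the others are judged. This is a schematic sketch, not the Rasch-based procedure used in the study; the `ratings` data, the gap index, and the threshold are all invented for illustration:

```python
from statistics import mean

def sequential_drf_flags(ratings, threshold=1.0):
    """Toy sequential (purified) DRF screen.
    `ratings` maps rater -> {"A": [...], "B": [...]}: scores the rater gave to
    examinee groups A and B. A rater's DRF index is its gap mean(A) - mean(B)
    compared against the gap estimated from the unflagged anchor raters."""
    def gap(raters):
        a = [s for r in raters for s in ratings[r]["A"]]
        b = [s for r in raters for s in ratings[r]["B"]]
        return mean(a) - mean(b)

    raters = list(ratings)
    flagged = set()
    for _ in raters:                        # iterate until the flag set stabilizes
        anchor = [r for r in raters if r not in flagged] or raters
        ref = gap(anchor)                   # reference gap from the purified anchor
        new = {r for r in raters
               if abs(mean(ratings[r]["A"]) - mean(ratings[r]["B"]) - ref) > threshold}
        if new == flagged:
            break
        flagged = new
    return flagged

# r3 systematically favors group A; r1 and r2 are unbiased
ratings = {"r1": {"A": [3, 4], "B": [3, 4]},
           "r2": {"A": [2, 3], "B": [2, 3]},
           "r3": {"A": [4, 4], "B": [2, 2]}}
print(sequential_drf_flags(ratings))  # -> {'r3'}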
Language TestingPub Date : 2022-05-01DOI: 10.1177/02655322221076033
Melissa A. Bowles
{"title":"Using instructor judgment, learner corpora, and DIF to develop a placement test for Spanish L2 and heritage learners","authors":"Melissa A. Bowles","doi":"10.1177/02655322221076033","DOIUrl":"https://doi.org/10.1177/02655322221076033","url":null,"abstract":"This study details the development of a local test designed to place university Spanish students (n = 719) into one of the four different course levels and to distinguish between traditional L2 learners and early bilinguals on the basis of their linguistic knowledge, regardless of the variety of Spanish they were exposed to. Early bilinguals include two groups—heritage learners (HLs), who were exposed to Spanish in their homes and communities growing up, and early L2 learners with extensive Spanish exposure, often through dual immersion education, who are increasingly enrolling in university Spanish courses and tend to pattern with HLs. Expert instructor judgment and learner corpora contributed to item development, and 12 of 15 written multiple-choice test items targeting early-acquired vocabulary had differential item functioning (DIF) according to the Mantel–Haenszel procedure, favoring HLs. Recursive partitioning revealed that vocabulary score correctly identified 597/603 (99%) of L2 learners as such, and the six HLs whose vocabulary scores incorrectly identified them as L2 learners were in the lowest placement groups. Vocabulary scores also correctly identified 100% of the early L2 learners in the sample (n = 7) as having a heritage profile. 
Implications for the local context and for placement testing in general are provided.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"355 - 376"},"PeriodicalIF":4.1,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46971326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language TestingPub Date : 2022-04-14DOI: 10.1177/02655322221076153
Gerriet Janssen
{"title":"Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia","authors":"Gerriet Janssen","doi":"10.1177/02655322221076153","DOIUrl":"https://doi.org/10.1177/02655322221076153","url":null,"abstract":"This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs and item types, while drawing stronger connections between the curriculum and the placement instrument. It also established a conceptual framework for the test and produced a more robust test form, psychometrically. The project intersected with different social forces, which impacted the project’s outcome in various ways. The project also illustrates how test retrofit provided local teachers with opportunities for language assessment literacy and with evidence-based knowledge about their students’ language proficiency. The study concludes that local assessment projects have the capacity to benefit local teachers, especially in terms of increased language assessment literacy. Intrinsic to a project’s sustainability are long-term financial commitment and institutionally established dedicated time, assigned to teacher participants. 
The study also concludes that project leadership requires both assessment and political skill sets, to conduct defensible research while compelling institutions to see the potential benefits of an ongoing test development or retrofit project.","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"377 - 400"},"PeriodicalIF":4.1,"publicationDate":"2022-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48953721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language TestingPub Date : 2022-04-04DOI: 10.1177/02655322221086211
J. Read
{"title":"Test Review: The International English Language Testing System (IELTS)","authors":"J. Read","doi":"10.1177/02655322221086211","DOIUrl":"https://doi.org/10.1177/02655322221086211","url":null,"abstract":"","PeriodicalId":17928,"journal":{"name":"Language Testing","volume":"39 1","pages":"679 - 694"},"PeriodicalIF":4.1,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48370702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}