Exploring the cross-lingual influence of linguistic complexity in second language writing assessment
Sara Geremia, Thomas Gaillat, Nicolas Ballier, Andrew J. Simpkin
Assessing Writing, volume 66, Article 100951. Published 2025-08-16. DOI: 10.1016/j.asw.2025.100951
https://www.sciencedirect.com/science/article/pii/S1075293525000388
Abstract
This paper explores the influence of the first language (L1) on the linguistic complexity of English learners' writing. It relies on features extracted from texts and modelled within a statistical learning framework. Linguistic complexity is assessed automatically in terms of proficiency levels across different L1s. We investigate whether proficiency grading by humans matches clusters of learner writings based on the similarity of linguistic features. We then use complexity metrics to automatically assess proficiency levels in samples of writings from different L1s. We focus on variable importance to understand which features best discriminate between levels. Analytic clusters of linguistic complexity data do not map well onto proficiency levels, which bodes poorly for the relevance of using linguistic complexity metrics for level prediction. However, assessing L1 influence on linguistic complexity through a multinomial logistic regression with elastic net regularisation shows significant results. The models predict the proficiency levels of students of different L1s.
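For readers unfamiliar with the model named in the abstract, the following is a sketch of the usual objective for a multinomial logistic regression with elastic net regularisation. The abstract does not give the parameterisation the authors used, so the symbols here are assumptions: $x_i$ is the vector of complexity features for text $i$, $y_i \in \{1, \dots, K\}$ its proficiency level, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mix between the lasso ($\alpha = 1$) and ridge ($\alpha = 0$) penalties.

```latex
\Pr(y_i = k \mid x_i)
  = \frac{\exp\!\left(\beta_{0k} + x_i^{\top}\beta_k\right)}
         {\sum_{l=1}^{K} \exp\!\left(\beta_{0l} + x_i^{\top}\beta_l\right)},
\qquad k = 1, \dots, K

\min_{\{\beta_{0k},\, \beta_k\}_{k=1}^{K}}
  \; -\frac{1}{N} \sum_{i=1}^{N} \log \Pr(y_i \mid x_i)
  \; + \; \lambda \sum_{k=1}^{K}
     \left[ \frac{1-\alpha}{2}\, \lVert \beta_k \rVert_2^2
            + \alpha\, \lVert \beta_k \rVert_1 \right]
```

Under this formulation the $\ell_1$ term can shrink some coefficients exactly to zero, which is one common way of reading variable importance from such a model: features whose coefficients remain non-zero are the ones the fit relies on to separate levels.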
About the journal:
Assessing Writing is a refereed international journal providing a forum for ideas, research and practice on the assessment of written language. Assessing Writing publishes articles, book reviews, conference reports, and academic exchanges concerning writing assessments of all kinds, including traditional (direct and standardised forms of) testing of writing, alternative performance assessments (such as portfolios), workplace sampling and classroom assessment. The journal focuses on all stages of the writing assessment process, including needs evaluation, assessment creation, implementation, validation, and test development.