What makes listening comprehension difficult?: A feature-based machine learning approach to understanding item difficulty
Huiying Cai, Xun Yan, Ping-Lin Chuang, Yulin Pan, Mingyue Huo
Applied Linguistics, published 2025-11-24
DOI: 10.1093/applin/amaf079 (https://doi.org/10.1093/applin/amaf079)
Abstract
Understanding what makes second language (L2) listening comprehension difficult is crucial for advancing language learning and assessment. In L2 listening assessment, a key challenge is developing items with targeted difficulty levels. Item difficulty can be influenced by textual and acoustic features from different item segments (i.e. stimuli, stems, and options) embedded in a multi-layered structure, along with task-related features. This study explores a feature-based machine learning (ML) approach to predicting the difficulty of multiple-choice listening items on a local language proficiency test. We extracted construct-relevant textual and acoustic features from item segments across five dimensions: lexical complexity, syntactic complexity, fluency, pronunciation, and similarities among item segments. Incorporating these features, we compared traditional and mixed-effects ML models for predictive accuracy and interpretability. The best-performing model, a mixed-effects Ridge model with twenty-three features, achieved high accuracy (R² = 0.860) and showed meaningful feature-difficulty relationships. This study presents methodological innovations for item difficulty modeling and offers practical implications for human- and machine-mediated item development. It also demonstrates the potential of incorporating computational linguistics and ML in enhancing L2 listening assessment.
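To make the modeling idea concrete, the following is a minimal sketch of plain (fixed-effects) ridge regression predicting an item-difficulty score from item features and reporting R² on held-out items. It is not the authors' model: the mixed-effects component is omitted, the data are synthetic, and the feature names, weights, and penalty value `alpha` are all hypothetical choices made for illustration.

```python
import random

# Hypothetical feature names, loosely inspired by the dimensions in the abstract.
FEATURES = ["speech_rate", "lexical_sophistication", "stem_option_similarity"]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(X, y, alpha):
    """Minimise ||y - b0 - X b||^2 + alpha * ||b||^2 (intercept unpenalised)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations on centred data: (Xc'Xc + alpha * I) b = Xc'yc
    A = [[sum(r[j] * r[k] for r in Xc) + (alpha if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * t for r, t in zip(Xc, yc)) for j in range(p)]
    coefs = solve(A, rhs)
    intercept = y_mean - sum(c * mu for c, mu in zip(coefs, x_mean))
    return intercept, coefs


def predict(X, intercept, coefs):
    return [intercept + sum(c * v for c, v in zip(coefs, row)) for row in X]


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic items: standardised features and an assumed linear difficulty signal.
rng = random.Random(0)
true_weights = [0.8, -0.5, 0.3]  # hypothetical, not estimates from the study
X = [[rng.gauss(0, 1) for _ in FEATURES] for _ in range(200)]
y = [sum(w * v for w, v in zip(true_weights, row)) + rng.gauss(0, 0.3)
     for row in X]

# Hold out the last 50 items to estimate predictive accuracy.
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
intercept, coefs = ridge_fit(X_train, y_train, alpha=1.0)
r2 = r_squared(y_test, predict(X_test, intercept, coefs))

for name, c in zip(FEATURES, coefs):
    print(f"{name}: {c:+.3f}")
print(f"held-out R^2: {r2:.3f}")
```

The sign and size of each coefficient are what make a model of this kind interpretable: with standardised features, a positive weight means higher values of that feature go with harder items. Handling the nesting the abstract describes (items within shared stimuli) would require adding group-level effects, which this sketch deliberately leaves out.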
About the journal:
Applied Linguistics publishes research into language with relevance to real-world problems. The journal is keen to help make connections between fields, theories, research methods, and scholarly discourses, and welcomes contributions which critically reflect on current practices in applied linguistic research. It promotes scholarly and scientific discussion of issues that unite or divide scholars in applied linguistics. It is less interested in the ad hoc solution of particular problems and more interested in the handling of problems in a principled way by reference to theoretical studies.