Title: Automated analysis of common errors in L2 learner production: Prototype web application development
Author: Atsushi Mizumoto
DOI: 10.1017/s0272263125100934
Journal: Studies in Second Language Acquisition (Q1, Linguistics; Impact Factor 4.2)
Publication date: 2025-06-19
Publication type: Journal Article
Citations: 0
Abstract
This research report presents the development and validation of Auto Error Analyzer, a prototype web application designed to automate the calculation of accuracy and related metrics for measuring second language (L2) production. Building on recent advancements in natural language processing (NLP) and artificial intelligence (AI), Auto Error Analyzer introduces an automated accuracy measurement component, bridging a gap in existing assessment tools, which have traditionally required human judgment for accuracy evaluation. By using a state-of-the-art generative AI model (Llama 3.3) for error detection, Auto Error Analyzer analyzes L2 texts efficiently and cost-effectively, producing accuracy metrics (e.g., errors per 100 words). Validation results show high agreement between the tool's error counts and human rater judgments (r = .94), with similarly high micro-average precision and recall in error detection (.96 and .94, respectively; F1 = .95), and its T-unit and clause counts matched the output of established tools such as L2SCA. Developed under open science principles to ensure transparency and replicability, the tool aims to support researchers and educators while emphasizing the complementary role of human expertise in language assessment. The possibilities of Auto Error Analyzer for efficient and scalable error analysis, as well as its limitations in detecting context-dependent and first-language (L1)-influenced errors, are also discussed.
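The metrics named in the abstract (errors per 100 words; micro-averaged precision, recall, and F1 for error detection) can be sketched as below. This is an illustrative computation only, not the tool's actual code; all function names and the example counts are hypothetical.

```python
def errors_per_100_words(error_count: int, word_count: int) -> float:
    """Errors per 100 words, a common accuracy metric for L2 production."""
    return 100.0 * error_count / word_count


def micro_prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 computed from counts
    pooled across all texts: true positives (tp), false positives (fp),
    and false negatives (fn)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 12 detected errors in a 300-word essay
rate = errors_per_100_words(12, 300)  # 4.0 errors per 100 words

# Hypothetical pooled detection counts
p, r, f1 = micro_prf(tp=94, fp=4, fn=6)
```

Micro-averaging pools the raw counts before computing the ratios, so texts with more errors weigh more heavily than they would under macro-averaging, which averages per-text scores instead.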
Journal description:
Studies in Second Language Acquisition is a refereed journal of international scope devoted to the scientific discussion of the acquisition or use of non-native and heritage languages. Each volume (five issues) contains research articles of a quantitative, qualitative, or mixed-methods nature, in addition to essays on current theoretical matters. Other rubrics include shorter articles such as Replication Studies, Critical Commentaries, and Research Reports.