{"title":"转换在线语法教育:整合GETN V2和RoBERTa嵌入以实现有效的英语语法纠正","authors":"Yining Du","doi":"10.1016/j.aej.2025.08.036","DOIUrl":null,"url":null,"abstract":"<div><div>The increasing demand for intelligent language learning systems, traditional grammar correction models often fail to capture the complex syntactic dependencies and contextual nuances present in learner-generated texts. These limitations hinder accurate detection and correction of grammatical errors particularly in academic writing. This paper proposes a novel grammar correction framework that integrates deep contextual embeddings from RoBERTa with syntactic graph structures using Graph Embedded Transformer Network (GETN V2). The input sentences are first pre-processed through normalization, tokenization, syntactic parsing and stop word removal using the ChatLang-8 dataset. RoBERTa is then applied to extract high-dimensional contextual embeddings for each token capturing semantic dependencies. These embeddings are fused with syntactic graphs derived from dependency parsing where grammatical relationships such as subject–verb and modifier-noun are represented as labeled edges. The GETN V2 encoder combines these inputs through multi-relational message passing and relation-aware attention mechanisms dynamically weighting syntactic dependencies using graph-augmented transformer layers. A dual-module architecture performs error detection and correction in such a way that the detection layer identifies grammatical inconsistencies via masked attention and classification blocks, while the correction module leverages the confidence estimator and correction generator to refine output. The system also incorporates a feedback loop with a confusion matrix to dynamically update error correction strategies. Experimental evaluation on benchmark datasets demonstrates that the proposed model achieves an accuracy of 88.26% with F0.5 scores exceeding 84% on syntactically grounded errors such as subject–verb agreement and verb tense. Furthermore, the model maintains low latency under high concurrency making it suitable for real-time educational deployment.</div></div>","PeriodicalId":7484,"journal":{"name":"alexandria engineering journal","volume":"130 ","pages":"Pages 695-708"},"PeriodicalIF":6.8000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transforming online grammar education: Integrating GETN V2 and RoBERTa embeddings for effective English grammar correction\",\"authors\":\"Yining Du\",\"doi\":\"10.1016/j.aej.2025.08.036\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The increasing demand for intelligent language learning systems, traditional grammar correction models often fail to capture the complex syntactic dependencies and contextual nuances present in learner-generated texts. These limitations hinder accurate detection and correction of grammatical errors particularly in academic writing. This paper proposes a novel grammar correction framework that integrates deep contextual embeddings from RoBERTa with syntactic graph structures using Graph Embedded Transformer Network (GETN V2). The input sentences are first pre-processed through normalization, tokenization, syntactic parsing and stop word removal using the ChatLang-8 dataset. RoBERTa is then applied to extract high-dimensional contextual embeddings for each token capturing semantic dependencies. 
These embeddings are fused with syntactic graphs derived from dependency parsing where grammatical relationships such as subject–verb and modifier-noun are represented as labeled edges. The GETN V2 encoder combines these inputs through multi-relational message passing and relation-aware attention mechanisms dynamically weighting syntactic dependencies using graph-augmented transformer layers. A dual-module architecture performs error detection and correction in such a way that the detection layer identifies grammatical inconsistencies via masked attention and classification blocks, while the correction module leverages the confidence estimator and correction generator to refine output. The system also incorporates a feedback loop with a confusion matrix to dynamically update error correction strategies. Experimental evaluation on benchmark datasets demonstrates that the proposed model achieves an accuracy of 88.26% with F0.5 scores exceeding 84% on syntactically grounded errors such as subject–verb agreement and verb tense. Furthermore, the model maintains low latency under high concurrency making it suitable for real-time educational deployment.</div></div>\",\"PeriodicalId\":7484,\"journal\":{\"name\":\"alexandria engineering journal\",\"volume\":\"130 \",\"pages\":\"Pages 695-708\"},\"PeriodicalIF\":6.8000,\"publicationDate\":\"2025-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"alexandria engineering journal\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1110016825009330\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"alexandria engineering journal","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1110016825009330","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
Transforming online grammar education: Integrating GETN V2 and RoBERTa embeddings for effective English grammar correction
With the increasing demand for intelligent language learning systems, traditional grammar correction models often fail to capture the complex syntactic dependencies and contextual nuances present in learner-generated texts. These limitations hinder accurate detection and correction of grammatical errors, particularly in academic writing. This paper proposes a novel grammar correction framework that integrates deep contextual embeddings from RoBERTa with syntactic graph structures using a Graph Embedded Transformer Network (GETN V2). Input sentences from the ChatLang-8 dataset are first pre-processed through normalization, tokenization, syntactic parsing, and stop-word removal. RoBERTa is then applied to extract high-dimensional contextual embeddings for each token, capturing semantic dependencies. These embeddings are fused with syntactic graphs derived from dependency parsing, in which grammatical relationships such as subject–verb and modifier–noun are represented as labeled edges. The GETN V2 encoder combines these inputs through multi-relational message passing and relation-aware attention mechanisms, dynamically weighting syntactic dependencies using graph-augmented transformer layers. A dual-module architecture performs error detection and correction: the detection layer identifies grammatical inconsistencies via masked attention and classification blocks, while the correction module leverages a confidence estimator and a correction generator to refine the output. The system also incorporates a feedback loop with a confusion matrix to dynamically update error correction strategies. Experimental evaluation on benchmark datasets demonstrates that the proposed model achieves an accuracy of 88.26%, with F0.5 scores exceeding 84% on syntactically grounded errors such as subject–verb agreement and verb tense. Furthermore, the model maintains low latency under high concurrency, making it suitable for real-time educational deployment.
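The abstract's core mechanism, fusing per-token RoBERTa embeddings with a labeled dependency graph through relation-aware message passing, can be illustrated with a short sketch. The code below is a minimal, simplified illustration in PyTorch with Hugging Face transformers, not the authors' GETN V2 implementation; the toy sentence, the hand-written dependency edges, the relation inventory, and the RelationAwareFusion layer are assumptions introduced only for demonstration.

# Minimal sketch (not the paper's code): RoBERTa token embeddings fused with a
# labeled dependency graph via per-relation message passing, as a stand-in for the
# relation-aware attention described in the abstract.
import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizerFast

# 1) Contextual embeddings: one 768-d vector per word piece.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")
sentence = "She go to school every day"            # learner error: subject-verb agreement
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    tokens = encoder(**inputs).last_hidden_state[0]  # shape: (seq_len, 768)

# 2) Hand-written stand-in for a dependency parse over the same word pieces
#    (position 0 is the <s> token). Each edge is (head, dependent, relation).
RELATIONS = {"nsubj": 0, "obl": 1, "case": 2, "det": 3}
edges = [
    (2, 1, "nsubj"),   # go -> She      (the subject-verb link the model must inspect)
    (2, 4, "obl"),     # go -> school
    (4, 3, "case"),    # school -> to
    (2, 6, "obl"),     # go -> day
    (6, 5, "det"),     # day -> every
]

class RelationAwareFusion(nn.Module):
    """One multi-relational message-passing step: each dependent receives a message
    from its head, transformed and weighted per dependency relation, then fused with
    the original RoBERTa state through a residual connection."""

    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.rel_proj = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_relations)])
        self.rel_gate = nn.Parameter(torch.zeros(num_relations))  # learned per-relation weight
        self.out = nn.Linear(dim, dim)

    def forward(self, x, edge_list, relation_ids):
        messages = torch.zeros_like(x)
        degree = torch.zeros(x.size(0), 1)
        for (head, dep), rel in zip(edge_list, relation_ids):
            weight = torch.sigmoid(self.rel_gate[rel])             # relation-specific score
            messages[dep] += weight * self.rel_proj[rel](x[head])
            degree[dep] += weight
        messages = messages / degree.clamp(min=1.0)                # normalise incoming messages
        return x + torch.relu(self.out(messages))                  # residual, syntax-aware fusion

layer = RelationAwareFusion(dim=tokens.size(-1), num_relations=len(RELATIONS))
pairs = [(h, d) for h, d, _ in edges]
rel_ids = [RELATIONS[r] for _, _, r in edges]
with torch.no_grad():
    fused = layer(tokens, pairs, rel_ids)
print(fused.shape)                                                 # torch.Size([8, 768]) for this sentence

On the reported F0.5 metric: it is the F-beta score with beta = 0.5, i.e. F0.5 = 1.25 * P * R / (0.25 * P + R), which weights precision more heavily than recall; this is the usual choice in grammar correction, where proposing a wrong correction is costlier than missing one.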
Journal introduction:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. Alexandria Engineering Journal is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). The papers published in Alexandria Engineering Journal are grouped into five sections according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering