Automated detection of machine translation use in L2 Spanish writing

IF 3.8, Zone 1 (Literature), JCR Q1 (Education & Educational Research)

Language Teaching Research. Pub Date: 2025-08-30. DOI: 10.1177/13621688251352263

Luciane L Maimone, Falcon Restrepo Ramos, Jason Jolley

Citations: 0

Abstract

Google Translate (GT) has become a popular machine translation (MT) tool among language learners, received by instructors with excitement over its pedagogical potential and concerns about its possible misuse in the classroom, particularly when this misuse goes undetected. This study investigated the suitability of natural language processing (NLP) software for the automated detection of MT use in second language (L2) writing, examining a dataset composed of written samples generated by GT and direct L2 writing produced by intermediate-level postsecondary learners of Spanish. NLP-powered analyses found significant lexical and sentential-level differences, as well as estimated proficiency-level differences across text types. Automated judgments based on lexical diversity and amount of coordination yielded detection accuracy rates of 73.08% each, whereas proficiency estimates informed correct automated judgments with an overall accuracy rate of 86.54%. An automated reverse-translation protocol using probability estimates was capable of differentiating between direct L2 writing and MT-assisted texts 98% of the time, far surpassing human detection rates (73%) found in a previous study for the same dataset. These findings argue strongly for the potential of NLP-driven textual analysis as a reliable tool to assist instructors in detecting unauthorized uses of MT in L2 writing.
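
The abstract's idea of an automated judgment based on lexical diversity, scored by detection accuracy, can be illustrated with a minimal sketch. The abstract does not say which diversity measure, cutoff, or decision direction the study used, so everything here is an assumption made for illustration: the type-token ratio as the measure, the `threshold` value of 0.6, the rule that higher diversity gets flagged, and the function names themselves.

```python
import re


def type_token_ratio(text: str) -> float:
    """Unique word forms divided by total word tokens (0.0 for empty text)."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented Spanish
    # words such as "también" are kept whole.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def flag_as_mt(text: str, threshold: float = 0.6) -> bool:
    """Hypothetical rule: flag a text whose diversity exceeds the cutoff.

    Both the cutoff and the direction of the comparison are assumptions,
    not values reported in the study.
    """
    return type_token_ratio(text) > threshold


def accuracy(predictions: list[bool], labels: list[bool]) -> float:
    """Share of automated judgments that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

The type-token ratio falls as texts get longer, so a real comparison would need samples of similar length or a length-corrected measure.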

Source journal

Language Teaching Research Multiple-

CiteScore

13.20

Self-citation rate

7.10%

Articles published

116

About the journal: Language Teaching Research is a peer-reviewed journal that publishes research within the area of second or foreign language teaching. Although articles are written in English, the journal also welcomes studies dealing with the teaching of languages other than English. The journal is a venue for studies that demonstrate sound research methods and report findings with clear pedagogical implications. A wide range of topics in the area of language teaching is covered, including:

- Programme
- Syllabus
- Materials design
- Methodology
- The teaching of specific skills and language for specific purposes

Thorough investigation and research ensures this journal is:

- International in focus, publishing work from countries worldwide
- Interdisciplinary, encouraging work which seeks to break down barriers that have isolated language teaching professionals from others concerned with pedagogy
- Innovative, seeking to stimulate new avenues of enquiry, including "action" research