{"title":"基于机器学习的英语学习者翻译错误分类与分析","authors":"Ying Qin","doi":"10.4018/IJCALLT.2019070105","DOIUrl":null,"url":null,"abstract":"This study extracts the comments from a large scale of Chinese EFL learners' translation corpus to study the taxonomy of translation errors. Two unsupervised machine learning approaches are used to obtain the computational evidences of translation error taxonomy. After manually revision, ten types of English to Chinese (E2C) and eight types Chinese to English (C2E) translation errors are finally confirmed. There probably exists three categories of top-level errors according to the hierarchical clustering results. In addition, three supervised learning methods are applied to automatically recognize the types of errors, among which the highest performance reaches F1 = 0.85 on E2C and F1 = 0.90 on C2E translation. Further comparison to the intuitive or theoretical studies on translation taxonomy shows some phenomenon accompanied by language skill improvement of Chinese learners. Analysis on translation problems based on machine learning provides the objective insight and understanding on the students' translations.","PeriodicalId":43610,"journal":{"name":"International Journal of Computer-Assisted Language Learning and Teaching","volume":null,"pages":null},"PeriodicalIF":0.7000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Machine Learning Based Taxonomy and Analysis of English Learners' Translation Errors\",\"authors\":\"Ying Qin\",\"doi\":\"10.4018/IJCALLT.2019070105\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This study extracts the comments from a large scale of Chinese EFL learners' translation corpus to study the taxonomy of translation errors. Two unsupervised machine learning approaches are used to obtain the computational evidences of translation error taxonomy. After manually revision, ten types of English to Chinese (E2C) and eight types Chinese to English (C2E) translation errors are finally confirmed. There probably exists three categories of top-level errors according to the hierarchical clustering results. In addition, three supervised learning methods are applied to automatically recognize the types of errors, among which the highest performance reaches F1 = 0.85 on E2C and F1 = 0.90 on C2E translation. Further comparison to the intuitive or theoretical studies on translation taxonomy shows some phenomenon accompanied by language skill improvement of Chinese learners. Analysis on translation problems based on machine learning provides the objective insight and understanding on the students' translations.\",\"PeriodicalId\":43610,\"journal\":{\"name\":\"International Journal of Computer-Assisted Language Learning and Teaching\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.7000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer-Assisted Language Learning and Teaching\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4018/IJCALLT.2019070105\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer-Assisted Language Learning and Teaching","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4018/IJCALLT.2019070105","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Machine Learning Based Taxonomy and Analysis of English Learners' Translation Errors
This study extracts the comments from a large scale of Chinese EFL learners' translation corpus to study the taxonomy of translation errors. Two unsupervised machine learning approaches are used to obtain the computational evidences of translation error taxonomy. After manually revision, ten types of English to Chinese (E2C) and eight types Chinese to English (C2E) translation errors are finally confirmed. There probably exists three categories of top-level errors according to the hierarchical clustering results. In addition, three supervised learning methods are applied to automatically recognize the types of errors, among which the highest performance reaches F1 = 0.85 on E2C and F1 = 0.90 on C2E translation. Further comparison to the intuitive or theoretical studies on translation taxonomy shows some phenomenon accompanied by language skill improvement of Chinese learners. Analysis on translation problems based on machine learning provides the objective insight and understanding on the students' translations.
期刊介绍:
The mission of the International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT) is to publish research, theory, and conceptually-based papers that address the use and impact of and innovations in education technologies in advancing foreign/second language learning and teaching. This journal expands on the principles, theories, designs, discussion, and implementations of computer-assisted language learning. In addition to original research papers and submissions on theory and concept development and systematic reports of practice, this journal welcomes theory-based CALL-related book and software/application reviews.