Crosslingual Content Scoring in Five Languages Using Machine-Translation and Multilingual Transformer Models

IF 8.5 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

International Journal of Artificial Intelligence in Education. Pub Date: 2023-11-03. DOI: 10.1007/s40593-023-00370-1

Andrea Horbach, Joey Pehlke, Ronja Laarmann-Quante, Yuning Ding


Citations: 0

Abstract


This paper investigates crosslingual content scoring, a scenario where scoring models trained on learner data in one language are applied to data in a different language. We analyze data in five different languages (Chinese, English, French, German and Spanish) collected for three prompts of the established English ASAP content scoring dataset. We cross the language barrier by means of both shallow and deep learning crosslingual classification models using both machine translation and multilingual transformer models. We find that a combination of machine translation and multilingual models outperforms each method individually: our best results are reached when combining the available data in different languages, i.e., first training a model on the large English ASAP dataset before fine-tuning on smaller amounts of training data in the target language.
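The two-stage idea in the abstract (train on the larger English data first, then continue training on a small amount of target-language data) can be illustrated with a toy sketch. This is not the authors' implementation: the word lexicon `TOY_DE_EN`, the function `translate_to_english`, the class `PerceptronScorer`, and all example answers and scores are hypothetical stand-ins. A real system would use an actual machine translation service and, for the deep learning variant, a multilingual transformer rather than a bag-of-words perceptron.

```python
# Toy illustration only: a word lexicon stands in for machine translation,
# and a bag-of-words perceptron stands in for the scoring model.

# Hypothetical German-to-English lexicon (assumption, not from the paper).
TOY_DE_EN = {
    "die": "the", "pflanze": "plant", "braucht": "needs",
    "licht": "light", "wasser": "water",
    "ich": "i", "weiss": "know", "nicht": "not",
}


def translate_to_english(text, lexicon=TOY_DE_EN):
    """Word-by-word lookup; unknown words are passed through unchanged."""
    return " ".join(lexicon.get(tok, tok) for tok in text.lower().split())


class PerceptronScorer:
    """Multi-class perceptron over whitespace tokens."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = {label: {} for label in self.labels}

    def _score(self, label, tokens):
        return sum(self.weights[label].get(tok, 0.0) for tok in tokens)

    def predict(self, text):
        tokens = text.lower().split()
        # Ties are resolved in favour of the first label in self.labels.
        return max(self.labels, key=lambda label: self._score(label, tokens))

    def fit(self, examples, epochs=5):
        """Continue training from the current weights (no reset)."""
        for _ in range(epochs):
            for text, gold in examples:
                pred = self.predict(text)
                if pred == gold:
                    continue
                for tok in text.lower().split():
                    w_gold, w_pred = self.weights[gold], self.weights[pred]
                    w_gold[tok] = w_gold.get(tok, 0.0) + 1.0
                    w_pred[tok] = w_pred.get(tok, 0.0) - 1.0
        return self


# Invented answers with binary scores (1 = correct content, 0 = not).
english_train = [
    ("the plant needs light", 1),
    ("the plant needs water and light", 1),
    ("i do not know", 0),
    ("plants are green", 0),
]
target_train = [
    ("sonne und wasser", 1),   # "sonne" is missing from the toy lexicon
    ("ich weiss nicht", 0),
]

scorer = PerceptronScorer(labels=[0, 1])

# Stage 1: train on the larger English set.
scorer.fit(english_train)
before = scorer.predict(translate_to_english("viel sonne"))

# Stage 2: continue training on translated target-language answers.
translated = [(translate_to_english(text), score) for text, score in target_train]
scorer.fit(translated, epochs=2)
after = scorer.predict(translate_to_english("viel sonne"))

print(before, after)
print(scorer.predict(translate_to_english("die pflanze braucht licht")))
```

In this toy setup, the second stage lets the scorer pick up a target-language word that the stand-in translator could not map to English, which is one intuition for why adding target-language data might help. The sketch makes no claim about the magnitude of that effect in the paper's experiments.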


Source journal

International Journal of Artificial Intelligence in Education (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

CiteScore: 11.10

Self-citation rate: 6.10%

Journal description: IJAIED publishes papers concerned with the application of AI to education. It aims to help the development of principles for the design of computer-based learning systems. Its premise is that such principles involve the modelling and representation of relevant aspects of knowledge, before implementation or during execution, and hence require the application of AI techniques and concepts. IJAIED has a very broad notion of the scope of AI and of a "computer-based learning system", as indicated by the following list of topics considered to be within the scope of IJAIED:

adaptive and intelligent multimedia and hypermedia systems; agent-based learning environments; AIED and teacher education; architectures for AIED systems; assessment and testing of learning outcomes; authoring systems and shells for AIED systems; Bayesian and statistical methods; case-based systems; cognitive development; cognitive models of problem-solving; cognitive tools for learning; computer-assisted language learning; computer-supported collaborative learning; dialogue (argumentation, explanation, negotiation, etc.); discovery environments and microworlds; distributed learning environments; educational robotics; embedded training systems; empirical studies to inform the design of learning environments; environments to support the learning of programming; evaluation of AIED systems; formal models of components of AIED systems; help and advice systems; human factors and interface design; instructional design principles; instructional planning; intelligent agents on the internet; intelligent courseware for computer-based training; intelligent tutoring systems; knowledge and skill acquisition; knowledge representation for instruction; modelling metacognitive skills; modelling pedagogical interactions; motivation; natural language interfaces for instructional systems; networked learning and teaching systems; neural models applied to AIED systems; performance support systems; practical, real-world applications of AIED systems; qualitative reasoning in simulations; situated learning and cognitive apprenticeship; social and cultural aspects of learning; student modelling and cognitive diagnosis; support for knowledge building communities; support for networked communication; theories of learning and conceptual change; tools for administration and curriculum integration; tools for the guided exploration of information resources.