Evaluating the linguistic complexity of machine translation and LLMs for EFL/ESL applications: An entropy weight method
Authors: Yingqi Huang, Dechao Li, Andrew K.F. Cheung
Journal: Research Methods in Applied Linguistics, 4(3), Article 100229
Publication date: 2025-06-30
DOI: 10.1016/j.rmal.2025.100229
URL: https://www.sciencedirect.com/science/article/pii/S2772766125000503
Citations: 0
Abstract
Learners of English as a Foreign or Second Language (EFL/ESL) increasingly use machine translation (MT) tools, including neural machine translation systems (NMTs) and large language models (LLMs), to support their language learning and translation, because these tools are accurate and are cheaper and faster than human translation. Because NMTs and LLMs exhibit distinct linguistic features, assessing the linguistic complexity of the texts they produce is crucial for optimizing their use in EFL/ESL teaching and learning. This study examines two forms of absolute linguistic complexity that influence EFL/ESL activities: lexical complexity and syntactic complexity. Lexical complexity affects vocabulary recognition and semantic processing, while syntactic complexity influences sentence parsing and the internalization of grammatical rules. Because both dimensions are multifaceted and involve numerous indices that may point in different directions (e.g., a text may score high on some measures and low on others), an entropy weight method (EWM) is employed to assign data-driven weights to the indices and derive a balanced holistic complexity score. This approach enables a systematic comparison of translation outputs from NMTs (Google Translate, DeepL) and LLMs (ChatGPT-4o, OpenAI-o1). The findings reveal that LLMs generally exhibit higher holistic linguistic complexity, whereas NMTs tend to produce simpler translations. Pedagogically, LLM-translated texts may serve as more effective input for advanced language learners in EFL/ESL contexts, while NMT outputs may be more suitable for learners with lower linguistic proficiency.
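
To make the weighting step concrete, the following is a minimal sketch of a common textbook formulation of the entropy weight method. It is not the study's implementation: the abstract does not specify the normalisation, the index set, or the data, so the function names (`entropy_weights`, `holistic_scores`), the three index columns, the system labels, and every number in `toy_indices` are hypothetical placeholders. The sketch also assumes that a higher value on every index means greater complexity.

```python
import math


def entropy_weights(matrix):
    """Return (weights, normalised) for rows = texts, columns = complexity indices.

    Assumption: every index is oriented so that higher means more complex.
    """
    m, n = len(matrix), len(matrix[0])
    if m < 2:
        raise ValueError("need at least two rows to compute entropy")

    # Step 1: min-max normalise each column to the range [0, 1].
    normalised = [[0.0] * n for _ in range(m)]
    for j in range(n):
        column = [row[j] for row in matrix]
        low, span = min(column), max(column) - min(column)
        for i in range(m):
            normalised[i][j] = (matrix[i][j] - low) / span if span > 0 else 0.0

    # Step 2: Shannon entropy of each column, scaled by 1 / ln(m).
    k = 1.0 / math.log(m)
    divergence = []
    for j in range(n):
        total = sum(normalised[i][j] for i in range(m))
        if total == 0:
            divergence.append(0.0)  # a constant column carries no information
            continue
        entropy = 0.0
        for i in range(m):
            p = normalised[i][j] / total
            if p > 0:  # by convention 0 * ln(0) counts as 0
                entropy -= k * p * math.log(p)
        divergence.append(1.0 - entropy)

    # Step 3: weights are the divergences rescaled to sum to 1.
    d_sum = sum(divergence)
    weights = [d / d_sum for d in divergence] if d_sum > 0 else [1.0 / n] * n
    return weights, normalised


def holistic_scores(matrix):
    """Weighted sum of the normalised indices for each row."""
    weights, normalised = entropy_weights(matrix)
    return [sum(w * r for w, r in zip(weights, row)) for row in normalised]


# Hypothetical placeholder values, NOT results from the study.
# Columns: lexical diversity, mean sentence length, clauses per sentence.
toy_indices = {
    "system_A": [0.42, 18.0, 1.6],
    "system_B": [0.45, 19.5, 1.7],
    "system_C": [0.51, 23.0, 2.1],
    "system_D": [0.55, 22.0, 2.3],
}

if __name__ == "__main__":
    scores = holistic_scores(list(toy_indices.values()))
    for name, score in zip(toy_indices, scores):
        print(f"{name}: {score:.3f}")
```

In this formulation, an index whose values are spread unevenly across the texts has lower entropy and therefore receives a larger weight, while an index that is identical for every text receives a weight of zero. An index on which lower values indicate greater complexity would need to be reversed before normalisation.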