Den-ML: Multi-source cross-lingual transfer via denoising mutual learning
Ling Ge, Chunming Hu, Guanghui Ma, Hong Zhang, Jihong Liu
Information Processing & Management, 2024-07-18. DOI: 10.1016/j.ipm.2024.103834
https://www.sciencedirect.com/science/article/pii/S0306457324001936
Abstract
Multi-source cross-lingual transfer aims to acquire task knowledge from multiple labelled source languages and transfer it to an unlabelled target language, enabling effective performance in that target language. Existing methods mainly focus on weighting the predictions of language-specific classifiers trained on the source languages to derive final results for target samples. However, we argue that, due to the language gap, language-specific classifiers inevitably generate many noisy predictions for target samples. Furthermore, these methods disregard the mutual guidance and utilization of knowledge among the multiple source languages. To address these issues, we propose a novel model, Den-ML, which improves performance in multi-source scenarios from two perspectives: reducing the prediction noise of language-specific classifiers and promoting mutual learning among these classifiers. First, Den-ML devises a discrepancy-guided denoising learning method to learn discriminative representations for the target language, thus mitigating the classifiers' noisy predictions. Second, Den-ML develops a pseudo-label-supervised mutual learning method, which forces probability-distribution interactions among the multiple language-specific classifiers for knowledge transfer, thus achieving mutual learning among the classifiers. We conduct experiments on three tasks (named entity recognition, paraphrase identification, and natural language inference) under two multi-source combination settings (same-family and different-family), covering 39 languages. Our approach outperforms the benchmark and the SOTA model on most metrics for all three tasks in both settings. In addition, we perform ablation, visualization, and analysis experiments on the three tasks, and the results validate the effectiveness of our approach.
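The two mechanisms named in the abstract can be made concrete with a short sketch. Below is a minimal, illustrative PyTorch rendering of what such loss terms might look like: a discrepancy term measuring disagreement among language-specific classifiers on unlabelled target samples (one common reading of "discrepancy-guided denoising"), and a mutual-learning term that pulls each classifier toward an ensemble distribution on confidently pseudo-labelled samples. All function and variable names are our own inventions, and the paper's exact formulation (distance measures, pseudo-label filtering, loss weighting) may differ.

```python
# Illustrative sketch only; not the authors' released implementation.
import torch
import torch.nn.functional as F


def discrepancy_loss(logits_list):
    """Mean pairwise L1 distance between classifier probability outputs on
    unlabelled target samples. Minimizing this with respect to a shared
    encoder encourages target representations on which the language-specific
    classifiers agree, reducing noisy predictions (hypothetical reading)."""
    probs = [F.softmax(logits, dim=-1) for logits in logits_list]
    loss, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            loss = loss + (probs[i] - probs[j]).abs().mean()
            pairs += 1
    return loss / max(pairs, 1)


def mutual_learning_loss(logits_list, confidence=0.9):
    """Pseudo-label-supervised mutual learning: the averaged (ensemble)
    distribution supplies pseudo-labels, and each classifier is pulled
    toward the ensemble via KL on confidently labelled target samples."""
    probs = torch.stack([F.softmax(l, dim=-1) for l in logits_list])  # (K, B, C)
    ensemble = probs.mean(dim=0)                                      # (B, C)
    conf, _ = ensemble.max(dim=-1)
    mask = (conf >= confidence).float()                               # keep confident samples only
    loss = 0.0
    for logits in logits_list:
        log_p = F.log_softmax(logits, dim=-1)
        kl = F.kl_div(log_p, ensemble.detach(), reduction="none").sum(-1)
        loss = loss + (kl * mask).sum() / mask.sum().clamp(min=1.0)
    return loss / len(logits_list)


# Hypothetical usage: K language-specific heads over a shared encoder,
# applied to an unlabelled target-language batch.
# target_logits = [head(encoder(target_batch)) for head in heads]
# loss = discrepancy_loss(target_logits) + mutual_learning_loss(target_logits)
```

Under this reading, the discrepancy term shapes the shared representation so that the source-trained classifiers stop contradicting each other on target samples, while the KL term lets each classifier learn from its peers, which matches the abstract's description of probability-distribution interactions among classifiers.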
Journal Introduction
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.