Zero-Shot Cross-Lingual Named Entity Recognition via Progressive Multi-Teacher Distillation

IF 4.1 | Zone 2 (Computer Science) | JCR Q1 (ACOUSTICS)
Zhuoran Li;Chunming Hu;Richong Zhang;Junfan Chen;Xiaohui Guo
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4617-4630
Publication date: 2024-08-23
Publication type: Journal Article
DOI: 10.1109/TASLP.2024.3449029
URL: https://ieeexplore.ieee.org/document/10645066/
Citation count: 0

Abstract

Cross-lingual learning aims to transfer knowledge from one natural language to another. Zero-shot cross-lingual named entity recognition (NER) trains an NER model on source languages and uses it to identify named entities in other languages. Existing knowledge distillation-based models, which follow a teacher-student scheme, leverage unlabeled samples from the target languages and have proven superior in this setting. However, they ignore the valuable similarity information between tokens in the target language, and a teacher model trained solely on the source language generates low-quality pseudo-labels. Both issues degrade the performance of cross-lingual NER. To improve the reliability of the teacher model, in this study we first introduce an additional, simple binary classification teacher model, trained by similarity learning, that measures whether two inputs belong to the same class. This auxiliary binary classification task is easier, and the two teachers simultaneously supervise the student model for better performance. Furthermore, given this stronger student model, we propose a progressive knowledge distillation framework that extensively fine-tunes the teacher model on the target-language pseudo-labels generated by the student model. Empirical studies on three datasets across seven languages show that the proposed model outperforms state-of-the-art methods.
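The two-teacher supervision described in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss functions, so everything below is an assumption made for illustration only: the student is matched to the NER teacher's soft labels with a cross-entropy term, the similarity teacher's pairwise "same class" probability is matched with a binary cross-entropy term, and the student's own pairwise similarity is taken as the dot product of its two class distributions. The function names and the weight `alpha` are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy of the student distribution against teacher soft labels."""
    return -sum(t * math.log(s + eps) for t, s in zip(teacher_probs, student_probs))

def same_class_prob(p_i, p_j):
    """Assumed student similarity: chance that two tokens share a class,
    treating the two per-token predictions as independent."""
    return sum(a * b for a, b in zip(p_i, p_j))

def binary_cross_entropy(target, pred, eps=1e-12):
    """Binary cross-entropy between a target probability and a prediction."""
    return -(target * math.log(pred + eps)
             + (1 - target) * math.log(1 - pred + eps))

def two_teacher_loss(student_logits, ner_teacher_probs, sim_teacher_probs, alpha=0.5):
    """Combine both teachers' signals into one student loss (hypothetical form).

    student_logits:    per-token lists of class logits from the student
    ner_teacher_probs: per-token soft labels from the NER teacher
    sim_teacher_probs: {(i, j): P(tokens i and j share a class)} from the
                       binary similarity teacher
    alpha:             assumed weight between the two terms
    """
    student_probs = [softmax(row) for row in student_logits]
    ner_loss = sum(
        soft_cross_entropy(t, s) for t, s in zip(ner_teacher_probs, student_probs)
    ) / len(student_probs)
    sim_loss = sum(
        binary_cross_entropy(target, same_class_prob(student_probs[i], student_probs[j]))
        for (i, j), target in sim_teacher_probs.items()
    ) / max(len(sim_teacher_probs), 1)
    return alpha * ner_loss + (1 - alpha) * sim_loss

# Two tokens that both teachers consider to be class 0 and mutually similar.
ner_teacher = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
sim_teacher = {(0, 1): 1.0}
agree = two_teacher_loss([[5.0, 0.0, 0.0], [5.0, 0.0, 0.0]], ner_teacher, sim_teacher)
disagree = two_teacher_loss([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]], ner_teacher, sim_teacher)
print(agree < disagree)
```

In this sketch a student that agrees with both teachers receives a lower loss than one that mislabels the second token. The progressive stage mentioned in the abstract, in which the teacher is fine-tuned on the student's target-language pseudo-labels, is not shown here.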
Source journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing (ACOUSTICS-ENGINEERING, ELECTRICAL & ELECTRONIC)
CiteScore: 11.30
Self-citation rate: 11.10%
Articles published: 217
Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.