JHU System Description for the MADAR Arabic Dialect Identification Shared Task

WANLP@ACL 2019. Pub Date: 2019-08-01. DOI: 10.18653/v1/W19-4634

Thomas Lippincott, Pamela Shapiro, Kevin Duh, Paul McNamee


Citations: 5

Abstract


Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings and auxiliary language models. We found several of these techniques provided small boosts in performance, though a simple character-level language model was a strong baseline, and a lower-order LM achieved best performance on Subtask 2. Interestingly, word embeddings provided no consistent benefit, and ensembling struggled to outperform the best component submodel. This suggests the variety of architectures are learning redundant information, and future work may focus on encouraging decorrelated learning.
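The abstract notes that a simple character-level language model was a strong baseline. As a rough illustration of that general idea only, the sketch below trains one character n-gram model per label and assigns a sentence to the label whose model gives it the highest log-probability. It is not the authors' system and it is not Prediction by Partial Matching: the add-k smoothing, the default order of 3, the class and function names (`CharNgramLM`, `train_models`, `classify`), and the toy strings with placeholder labels "X" and "Y" are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class CharNgramLM:
    """Character n-gram LM with add-k smoothing (illustrative; not PPM)."""

    def __init__(self, order=3, k=0.1):
        self.order = order          # n-gram order (assumed default)
        self.k = k                  # add-k smoothing constant (assumed)
        self.ngrams = Counter()     # counts of (context, next_char)
        self.contexts = Counter()   # counts of context
        self.vocab = set()          # characters seen in training

    def _pairs(self, text):
        # Pad with start markers and close with an end marker.
        padded = "\x02" * (self.order - 1) + text + "\x03"
        for i in range(self.order - 1, len(padded)):
            yield padded[i - self.order + 1:i], padded[i]

    def train(self, sentences):
        for sentence in sentences:
            for ctx, ch in self._pairs(sentence):
                self.ngrams[(ctx, ch)] += 1
                self.contexts[ctx] += 1
                self.vocab.add(ch)

    def logprob(self, text):
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        total = 0.0
        for ctx, ch in self._pairs(text):
            num = self.ngrams[(ctx, ch)] + self.k
            den = self.contexts[ctx] + self.k * v
            total += math.log(num / den)
        return total


def train_models(labeled, order=3):
    """Train one model per label from (text, label) pairs."""
    by_label = defaultdict(list)
    for text, label in labeled:
        by_label[label].append(text)
    models = {}
    for label, texts in by_label.items():
        lm = CharNgramLM(order=order)
        lm.train(texts)
        models[label] = lm
    return models


def classify(models, text):
    """Return the label whose model scores the text highest."""
    return max(models, key=lambda label: models[label].logprob(text))


# Toy placeholder data, not drawn from the MADAR corpus.
toy = [("aaab aab ab", "X"), ("zzzy zzy zy", "Y")]
models = train_models(toy, order=3)
print(classify(models, "aab"))
```

Lowering `order` gives a lower-order model of the kind the abstract reports as performing best on Subtask 2, although which order and smoothing the authors actually used is not stated here.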
