在VarDial GDI 2017上:测试各种机器学习工具用于瑞士德语方言分类

Workshop on NLP for Similar Languages, Varieties and Dialects Pub Date : 2017-04-03 DOI:10.18653/v1/W17-1221

S. Clematide, Peter Makarov

{"title":"在VarDial GDI 2017上:测试各种机器学习工具用于瑞士德语方言分类","authors":"S. Clematide, Peter Makarov","doi":"10.18653/v1/W17-1221","DOIUrl":null,"url":null,"abstract":"Our submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank) being beaten by the best system by 0.9%. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank) being 1% lower than the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches we did not include them in the submission.","PeriodicalId":167439,"journal":{"name":"Workshop on NLP for Similar Languages, Varieties and Dialects","volume":"90 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects\",\"authors\":\"S. Clematide, Peter Makarov\",\"doi\":\"10.18653/v1/W17-1221\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Our submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank) being beaten by the best system by 0.9%. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank) being 1% lower than the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches we did not include them in the submission.\",\"PeriodicalId\":167439,\"journal\":{\"name\":\"Workshop on NLP for Similar Languages, Varieties and Dialects\",\"volume\":\"90 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Workshop on NLP for Similar Languages, Varieties and Dialects\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/W17-1221\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop on NLP for Similar Languages, Varieties and Dialects","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/W17-1221","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 16

摘要

我们提交的GDI 2017共享任务是三种不同类型分类器的结果:Naïve贝叶斯，条件随机场(CRF)和支持向量机(SVM)。我们基于crf的运行达到了65%的F1加权得分(第三名)，被最好的系统击败了0.9%。通过分类准确率来衡量，我们的集成运行(Naïve Bayes, CRF, SVM)达到67%(第二等级)，比最佳系统低1%。我们还描述了我们用递归神经网络(RNN)架构进行的实验。由于它们的表现比我们的非神经方法差，所以我们没有将它们包括在提交中。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects

Our submissions for the GDI 2017 Shared Task are the results from three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank) being beaten by the best system by 0.9%. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank) being 1% lower than the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches we did not include them in the submission.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Workshop on NLP for Similar Languages, Varieties and Dialects

自引率

0.00%

发文量