{"title":"Learning Word Representations with Deep Neural Networks for Turkish","authors":"E. Dündar, Ethem Alpaydin","doi":"10.1109/SIU.2019.8806491","DOIUrl":null,"url":null,"abstract":"We test different word embedding methods in Turkish. The goal is to represent related words in a high dimensional space such that their positions reflect this relationship. We compare word2vec, fastText, and ELMo on three Turkish corpora of different sizes. Word2vec works at the word level, fastText works at the character level; ELMo, unlike the other two, is context dependent. Our experiments show that fastText is better on name and verb inflection, and word2vec is better on semantic/syntactic analogy tasks. Bag-of-words model is better than most trained word embedding models on classification.","PeriodicalId":326275,"journal":{"name":"2019 27th Signal Processing and Communications Applications Conference (SIU)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 27th Signal Processing and Communications Applications Conference (SIU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SIU.2019.8806491","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
We test different word embedding methods on Turkish. The goal is to represent related words in a high-dimensional space such that their positions reflect this relationship. We compare word2vec, fastText, and ELMo on three Turkish corpora of different sizes. Word2vec works at the word level and fastText at the character n-gram level; ELMo, unlike the other two, is context dependent. Our experiments show that fastText is better on noun and verb inflection, while word2vec is better on semantic/syntactic analogy tasks. A bag-of-words model outperforms most of the trained word embedding models on classification.
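To make the word2vec/fastText contrast concrete, here is a minimal sketch (not the authors' code) of training both models with gensim and probing them the way the abstract describes. The corpus file name, hyperparameters, and example Turkish words are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: word-level (word2vec) vs. character n-gram (fastText)
# embeddings on a Turkish corpus, using gensim 4.x.
# Assumes a hypothetical file "corpus_tr.txt" with one sentence per line.
from gensim.models import Word2Vec, FastText

with open("corpus_tr.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

# word2vec learns one vector per whole word; unseen words have no vector.
w2v = Word2Vec(sentences, vector_size=300, window=5, min_count=5, workers=4)

# fastText additionally learns character n-grams (min_n..max_n), which suits
# Turkish's rich noun/verb inflection: a surface form's vector is composed
# from its subword vectors, so inflected variants stay related.
ft = FastText(sentences, vector_size=300, window=5, min_count=5,
              min_n=3, max_n=6, workers=4)

# Analogy probe of the kind used in semantic/syntactic evaluation,
# e.g. "kral" (king) - "erkek" (man) + "kadın" (woman) ≈ "kraliçe" (queen).
print(w2v.wv.most_similar(positive=["kral", "kadın"],
                          negative=["erkek"], topn=3))

# fastText can embed an out-of-vocabulary inflected form via its n-grams;
# word2vec would raise a KeyError for a word absent from training.
print(ft.wv["evlerimizden"])  # "from our houses" -- works even if unseen
```

ELMo, being context dependent, has no single vector per word and would instead be queried with full sentences through a pretrained bidirectional language model, so it is omitted from this sketch.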