Knowledge Based Word Sense Disambiguation with Distributional Semantic Expansion for the Persian Language
H. Rouhizadeh, M. Shamsfard, Masoud Rouhizadeh
2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), 2020-10-29
DOI: 10.1109/ICCKE50421.2020.9303675 (https://doi.org/10.1109/ICCKE50421.2020.9303675)
Word Sense Disambiguation (WSD) can be a key component of downstream NLP applications. Existing WSD methods and systems are mostly developed and evaluated on English, while low-resource languages such as Persian have not been well studied. In this paper, we propose a new knowledge-based method for Persian WSD. Using a pre-trained LDA model, we retrieve the topics of each document and assign each ambiguous content word to one of those topics. For each possible sense s of a given word w, we compute the similarity between the FarsNet (the Persian WordNet) gloss of s and the words of the topic assigned to w. We then choose the sense with the highest score as the most probable one. We evaluate our method on a Persian all-words WSD dataset and show that it achieves state-of-the-art performance compared to other knowledge-based methods.
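The sense-selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how a word is assigned to a topic or which similarity measure is used, so the sketch assumes (a) the word goes to the document topic that gives it the highest probability and (b) similarity is bag-of-words cosine between the gloss and the topic words. The names `TOPICS`, `GLOSSES`, `assign_topic`, and `disambiguate` are hypothetical, and the English toy data stands in for LDA topics and FarsNet glosses.

```python
from collections import Counter
from math import sqrt

# Toy stand-ins (assumptions): topic word distributions as an LDA model
# might produce them, and sense glosses as a WordNet-like resource might
# provide them. Real input would come from LDA and FarsNet.
TOPICS = {
    "finance": {"bank": 0.05, "money": 0.04, "loan": 0.03, "account": 0.03},
    "nature": {"river": 0.05, "water": 0.04, "shore": 0.02, "bank": 0.01},
}
GLOSSES = {
    "bank#1": "an institution that keeps money and offers a loan or account",
    "bank#2": "sloping land beside a river or other body of water",
}


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def assign_topic(word, topics, doc_topics):
    """Assumed rule: pick the document topic giving the word the highest probability."""
    return max(doc_topics, key=lambda t: topics[t].get(word, 0.0))


def disambiguate(word, glosses, topics, doc_topics):
    """Score each sense gloss against the assigned topic's words; return the best sense."""
    topic_id = assign_topic(word, topics, doc_topics)
    topic_bag = Counter(topics[topic_id].keys())
    scores = {
        sense: cosine(Counter(gloss.lower().split()), topic_bag)
        for sense, gloss in glosses.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = disambiguate("bank", GLOSSES, TOPICS, ["finance", "nature"])
    print(best, scores)
```

In a document whose topics include "finance", the toy word "bank" is assigned to that topic and the financial gloss scores highest; restricting the document to the "nature" topic flips the choice. The paper's title suggests the actual similarity relies on distributional semantic expansion, which plain word overlap does not capture.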