Analysis of The Characteristics of Similar Words Computed by Word Embeddings
Shuhui Zhou, Peihan Liu, Lizhen Liu, Wei Song, Miaomiao Cheng
2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC), 2020-07-01
DOI: 10.1109/ICEIEC49280.2020.9152307 (https://doi.org/10.1109/ICEIEC49280.2020.9152307)
Citations: 0
Abstract
Word2vec is a popular word embedding technique that has attracted considerable attention in the NLP field. However, word embeddings based on distributed representations have a shortcoming rooted in distributional semantics. This shortcoming often surfaces when word similarity is used to find words similar to a seed word. This article analyzes the similar words retrieved in this way in light of that deficiency. We propose a novel classification criterion that classifies similar words into 7 categories. Finally, we list future research directions, hoping to solve the problem of word confusion effectively.
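
As a rough illustration of the retrieval step the abstract refers to, the sketch below ranks candidate words by cosine similarity to a seed word. It is a minimal sketch, not the authors' method: the abstract does not name a similarity measure, so cosine similarity is an assumption. The names `cosine`, `most_similar`, and `toy_vectors` are hypothetical, and the three-dimensional vectors are invented for illustration rather than taken from a trained Word2vec model or from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, vectors, topn=3):
    """Return the topn words closest to `seed`, excluding the seed itself."""
    seed_vec = vectors[seed]
    scored = [
        (word, cosine(seed_vec, vec))
        for word, vec in vectors.items()
        if word != seed
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical toy vectors (assumption: invented values, not real embeddings).
# "hot" and "cold" are placed close together to mimic words that share contexts.
toy_vectors = {
    "hot": [0.9, 0.8, 0.1],
    "cold": [0.8, 0.9, 0.1],
    "warm": [0.9, 0.6, 0.2],
    "table": [0.1, 0.2, 0.9],
}

neighbors = most_similar("hot", toy_vectors)
for word, score in neighbors:
    print(f"{word}\t{score:.3f}")
```

With these invented vectors, the antonym "cold" ranks above "warm" for the seed "hot", which is one plausible form of the word confusion the abstract mentions. Vectors that reflect only shared contexts need not separate words with different relations to the seed.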