A Study on the CBOW Model's Overfitting and Stability
Qun Luo, Weiran Xu, Jun Guo
Web-KR '14, published November 3, 2014
DOI: 10.1145/2663792.2663793 (https://doi.org/10.1145/2663792.2663793)
Citations: 25
Abstract
Word vectors are distributed representations of word features. The Continuous Bag-of-Words (CBOW) model is a state-of-the-art model for learning word vectors, yet it can still be improved, because we find that CBOW is prone to overfitting and instability. We apply two methods to address these two problems so that CBOW learns better word vectors: in this study, we add a regularized structural risk term to the objective function of the CBOW model, and we propose inverse word frequency encoding for the CBOW model. Our proposed methods substantially improve the quality of the word vectors, raising the correlation r from 0.638 to 0.696 on word relatedness and total accuracy from 30.80% to 38.43% on the word-pair relationship task, trained on 52 million words with 200-dimensional vectors.
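To make the first idea concrete, the sketch below shows one CBOW training step with an L2 penalty added to the objective, which is the standard way to regularize structural risk. This is a minimal illustration, not the authors' implementation: the full-softmax formulation, the learning rate, the penalty weight `lam`, and the "lazy" application of the penalty gradient (only to the parameter rows touched this step) are all assumptions made for clarity.

```python
import numpy as np

def cbow_step(W_in, W_out, context_ids, target_id, lr=0.1, lam=1e-3):
    """One CBOW step minimizing cross-entropy + lam * ||params||^2.

    W_in:  (V, D) input word vectors, W_out: (V, D) output vectors.
    Context vectors are averaged into a hidden layer, scored against
    all V words via softmax, and updated by one gradient step.
    """
    h = W_in[context_ids].mean(axis=0)             # (D,) hidden layer
    scores = W_out @ h                             # (V,) logits
    scores -= scores.max()                         # numerical stability
    p = np.exp(scores)
    p /= p.sum()                                   # softmax probabilities
    loss = -np.log(p[target_id]) + lam * (np.sum(W_in**2) + np.sum(W_out**2))

    # Gradient of cross-entropy w.r.t. logits, plus the L2 term.
    dscores = p.copy()
    dscores[target_id] -= 1.0                      # (V,)
    W_out -= lr * (np.outer(dscores, h) + 2 * lam * W_out)

    # Backprop through the context average; the L2 gradient is applied
    # lazily, only to the context rows updated this step.
    dh = W_out.T @ dscores                         # (D,)
    for i in context_ids:
        W_in[i] -= lr * (dh / len(context_ids) + 2 * lam * W_in[i])
    return loss
```

Repeated steps on the same (context, target) pair should drive the loss down while the penalty keeps the vector norms from growing unboundedly, which is the overfitting control the abstract describes.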