Discriminating gender on Chinese microblog: A study of online behaviour, writing style and preferred vocabulary
Authors: Li Li, Maosong Sun, Zhiyuan Liu
Venue: 2014 10th International Conference on Natural Computation (ICNC)
Published: 2014-12-08
DOI: 10.1109/ICNC.2014.6975942 (https://doi.org/10.1109/ICNC.2014.6975942)
Citations: 12
Abstract
Because user attributes are useful for applications such as personalized recommendation and advertising, user attribute prediction on Twitter has attracted intensive attention in recent years. Although Chinese micro-blogging services differ from Twitter in various aspects, such as language and user behaviour, few efforts have been made to study them. In this paper, we propose a gender prediction model for Chinese microblogs that exploits features including online behaviour, writing style, and preferred vocabulary. Experimental results on Sina Weibo, one of the most popular micro-blogging services in China, show that our model achieves a state-of-the-art accuracy of 94.3%. We also find significant distinctions between male and female microblog users in online behaviour, writing style, and preferred vocabulary, which would be helpful for improving personalized applications.
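The abstract names three feature groups but does not define the individual features or the classifier. As a rough illustration only, the sketch below shows how such groups could be combined into one feature vector and fed to a binary classifier. Everything in it is an assumption: the record fields (`posts`, `reposts`), the specific features, the placeholder tokens, the made-up toy data, and the choice of logistic regression are hypothetical stand-ins, not the paper's method. Real Weibo text would also need a Chinese word segmenter instead of whitespace splitting.

```python
from collections import Counter
import math

# Hypothetical sketch: features, field names, and data are illustrative
# assumptions, not the features or classifier used in the paper.

def extract_features(user):
    """Map one user record to a sparse feature dict covering three groups."""
    posts = user["posts"]
    n = max(len(posts), 1)
    feats = {
        "bias": 1.0,
        # Online behaviour (assumed): reposts per original post.
        "behaviour:repost_ratio": user["reposts"] / n,
        # Writing style (assumed): average post length (roughly scaled)
        # and number of exclamation marks per post.
        "style:avg_len": sum(len(p) for p in posts) / n / 100.0,
        "style:exclaim_rate": sum(p.count("!") for p in posts) / n,
    }
    # Preferred vocabulary (assumed): relative frequency of each token.
    tokens = [t for p in posts for t in p.split()]
    for tok, count in Counter(tokens).items():
        feats["vocab:" + tok] = count / max(len(tokens), 1)
    return feats

def train_logreg(examples, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient ascent."""
    weights = Counter()  # missing keys read as 0
    for _ in range(epochs):
        for feats, label in examples:
            z = sum(weights[k] * v for k, v in feats.items())
            p = 1.0 / (1.0 + math.exp(-z))
            for k, v in feats.items():
                weights[k] += lr * (label - p) * v
    return weights

def predict(weights, feats):
    """Return 1 if the linear score is positive, else 0."""
    z = sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return 1 if z > 0 else 0

# Made-up toy users with placeholder tokens; labels 0 and 1 stand for the
# two classes and carry no empirical meaning.
users = [
    ({"posts": ["aa bb cc", "aa dd"], "reposts": 0}, 0),
    ({"posts": ["bb dd ee", "aa bb"], "reposts": 1}, 0),
    ({"posts": ["xx yy !", "xx zz !"], "reposts": 2}, 1),
    ({"posts": ["yy zz ww !", "xx yy"], "reposts": 1}, 1),
]

examples = [(extract_features(u), y) for u, y in users]
weights = train_logreg(examples)

new_user = {"posts": ["xx yy !"], "reposts": 1}
print(predict(weights, extract_features(new_user)))
```

Prefixing each feature key with its group name keeps the three groups separable, so one could retrain with a single group removed to see how much each contributes.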