Inferring Latent Attributes of an Indian Twitter User using Celebrities and Class Influencers
Puneet Singh Ludu
Proceedings of the 1st ACM Workshop on Social Media World Sensors, 2015-09-01
DOI: 10.1145/2806655.2806657 (https://doi.org/10.1145/2806655.2806657)
Citations: 7
Abstract
In this paper we classify users along three attributes, "Gender", "Age" and "Political Affiliation", with an application to Indian Twitter users. Our approach automatically predicts these attributes by leveraging observable information such as tweeting behavior, the linguistic content of the user's Twitter feed, and the celebrities the user follows. We also introduce a novel feature that we call "class influencers": Twitter users who influence a particular class so strongly that they themselves can be used as a discriminating feature. Our approach first extracts linguistic-content-based features using the LIWC dictionary. We then derive behavioral features such as smiley types, smiley count, tweet frequency, and night-time tweet frequency. We also derive celebrity-based features: the age, genre, and gender (obtained from Wikipedia and Freebase) of the celebrities a user follows. Finally, we refine the results using class influencers. Results show that rich linguistic features combined with the popular neighborhood and influencers are valuable and promising for additional user classification needs.
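To make the behavioral features concrete, the following is a minimal Python sketch of how smiley types, smiley count, tweet frequency, and night-time tweet frequency might be computed for one user. It is an illustration under stated assumptions, not the paper's implementation: the function name `behavioral_features`, the smiley pattern `SMILEY_RE`, and the night-time window `NIGHT_HOURS` (22:00 to 04:59) are all hypothetical choices, since the abstract does not specify them.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical smiley pattern: eyes, optional nose, mouth (e.g. ":)", ";-D", ":(").
SMILEY_RE = re.compile(r"[:;=][\-']?[)(DPp]")

# Assumption: "night-time" means 22:00-04:59; the abstract gives no definition.
NIGHT_HOURS = set(range(22, 24)) | set(range(0, 5))


def behavioral_features(tweets):
    """Compute simple behavioral features from (datetime, text) pairs of one user."""
    if not tweets:
        return {"smiley_count": 0, "smiley_types": 0,
                "tweets_per_day": 0.0, "night_fraction": 0.0}

    smileys = Counter()
    night = 0
    for ts, text in tweets:
        # Count every smiley occurrence, keyed by its exact form.
        smileys.update(SMILEY_RE.findall(text))
        if ts.hour in NIGHT_HOURS:
            night += 1

    times = [ts for ts, _ in tweets]
    # Whole days covered by the sample, at least 1 to avoid division by zero.
    span_days = (max(times) - min(times)).days + 1

    return {
        "smiley_count": sum(smileys.values()),   # total smileys used
        "smiley_types": len(smileys),            # distinct smiley forms
        "tweets_per_day": len(tweets) / span_days,
        "night_fraction": night / len(tweets),   # share of tweets posted at night
    }


if __name__ == "__main__":
    sample = [
        (datetime(2015, 1, 1, 23, 0), "good night :) :)"),
        (datetime(2015, 1, 2, 10, 0), "morning :-D"),
        (datetime(2015, 1, 2, 12, 0), "no smiley here"),
    ]
    print(behavioral_features(sample))
```

The resulting dictionary could be concatenated with the linguistic and celebrity-based features before being passed to a classifier. Normalizing night-time activity as a fraction rather than a raw count is a design choice made here so that users with different tweet volumes remain comparable.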