Word Segmentation Algorithms with Lexical Resources for Hashtag Classification
Credell Simeon, Howard J. Hamilton, Robert J. Hilderman
2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), published 2016-10-01
DOI: 10.1109/DSAA.2016.80 (https://doi.org/10.1109/DSAA.2016.80)
Citation count: 8
Abstract
We present a novel method for classifying hashtag types. Specifically, we apply word segmentation algorithms and lexical resources to classify two types of hashtags: those with sentiment information and those without. The complex structure of hashtags makes sentiment information difficult to identify. To address this problem, we segment hashtags into smaller semantic units using word segmentation algorithms in conjunction with lexical resources. Our experimental results demonstrate that our approach achieves a 14% increase in accuracy over baseline methods for identifying hashtags with sentiment information. Additionally, we achieve over 94% recall using this hashtag type for the subjectivity detection of tweets.
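
The general idea of segmenting a hashtag and then consulting a lexical resource can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify which segmentation algorithms or lexicons were used, so the sketch assumes a simple dictionary-based segmentation that prefers the fewest words, and both `VOCAB` and `SENTIMENT_WORDS` are tiny hypothetical word lists invented for illustration. The function names `segment` and `has_sentiment` are likewise assumptions.

```python
from functools import lru_cache

# Toy resources invented for illustration; NOT the lexicons used in the paper.
VOCAB = {
    "i", "love", "this", "song", "so", "happy", "best", "day", "ever",
    "new", "york", "hate", "monday", "mondays",
}
SENTIMENT_WORDS = {"love", "happy", "best", "hate"}


def segment(hashtag, vocab=VOCAB):
    """Split a hashtag into the fewest vocabulary words.

    Returns a list of words, or None when no full segmentation exists.
    """
    text = hashtag.lstrip("#").lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Best segmentation of text[start:], as a tuple of words.
        if start == len(text):
            return ()
        candidates = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocab:
                rest = best(end)
                if rest is not None:
                    candidates.append((word,) + rest)
        # Prefer the segmentation with the fewest words.
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None


def has_sentiment(hashtag, vocab=VOCAB, lexicon=SENTIMENT_WORDS):
    """Label a hashtag as sentiment-bearing if any segmented word is in the lexicon."""
    words = segment(hashtag, vocab)
    if words is None:
        return False  # Unsegmentable hashtags are treated as non-sentiment here.
    return any(word in lexicon for word in words)


if __name__ == "__main__":
    for tag in ["#ilovethissong", "#newyork", "#ihatemondays"]:
        print(tag, segment(tag), has_sentiment(tag))
```

In this sketch, a hashtag such as `#ilovethissong` is split into smaller units before the lexicon lookup, which is what lets a sentiment word buried inside a concatenated hashtag be found. A realistic system would need a much larger vocabulary, a scoring scheme for ambiguous splits, and a real sentiment lexicon.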