Microblogging Hash Tag Recommendation System Based on Semantic TF-IDF: Twitter Use Case
M. S. Tajbakhsh, J. Bagherzadeh
2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), 2016-08-01
DOI: 10.1109/W-FiCloud.2016.59 (https://doi.org/10.1109/W-FiCloud.2016.59)
The character limit in microblogging systems such as Twitter forces users to use various terms for the same meaning, object, or concept. Sometimes the same term appears in a shortened form in a tweet (e.g. #friend and #frnd). This makes it harder to find similarities between such tags and their corresponding tweets. Classical text mining methods cannot be applied efficiently to short tweets. Thus tweet similarity, and subsequently tag recommendation, as one of the problems in microblogging social networks, needs a new and more efficient method. In this paper we define a new semantic-based method for finding similarities among short messages. We model each short message as a semantic vector that can be used with any similarity measure, such as cosine similarity. We then evaluate the accuracy of the resulting semantic-similarity-based tag recommendation system using various semantic-based algorithms and compare their results. The semantic-based algorithms used are: Shortest Path, Wu & Palmer, Lin, Jiang-Conrath, Resnik, Lesk, Leacock-Chodorow, and Hirst-St-Onge. Results are evaluated on 8,396,744 real English tweets and show roughly a sixfold improvement in accuracy over plain TF-IDF.
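The abstract does not spell out how the semantic vector is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each vocabulary term is weighted by the best word-to-word similarity it has with any word in the message, multiplied by that word's TF-IDF weight, and that tags are taken from the most similar known tweet under cosine similarity. The toy taxonomy `PARENT`, the `min_sim` threshold, the smoothed IDF formula, and all function names are hypothetical choices made for illustration; a real system would likely query WordNet instead of a hand-written hierarchy.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet.
PARENT = {
    "entity": None,
    "person": "entity", "friend": "person", "pal": "person",
    "food": "entity", "pizza": "food",
}

def path_to_root(word):
    """Return [word, parent, ..., root] in the toy taxonomy."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path

def wu_palmer(a, b):
    """Wu & Palmer similarity: 2*depth(lcs) / (depth(a) + depth(b)), root depth 1."""
    if a == b:
        return 1.0
    if a not in PARENT or b not in PARENT:
        return 0.0  # words outside the taxonomy only match themselves
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lcs = next(node for node in path_a if node in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))

def exact_match(a, b):
    """Similarity used by plain TF-IDF: a term only matches itself."""
    return 1.0 if a == b else 0.0

def idf_table(corpus):
    """Smoothed IDF (an assumption): ln((1 + N) / (1 + df)) + 1."""
    n = len(corpus)
    df = Counter(w for tokens, _ in corpus for w in set(tokens))
    return lambda w: math.log((1 + n) / (1 + df[w])) + 1.0

def semantic_vector(tokens, vocab, idf, sim, min_sim=0.5):
    """One entry per vocabulary term: best sim(term, word) * tfidf(word)."""
    weights = {w: count * idf(w) for w, count in Counter(tokens).items()}
    vec = []
    for term in vocab:
        best = 0.0
        for word, weight in weights.items():
            s = sim(term, word)
            if s >= min_sim:  # ignore weak, likely spurious matches
                best = max(best, s * weight)
        vec.append(best)
    return vec

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def recommend(query, corpus, sim, top_k=1):
    """Return the tags of the top_k most similar tweets with a nonzero score."""
    vocab = sorted({w for tokens, _ in corpus for w in tokens})
    idf = idf_table(corpus)
    q = semantic_vector(query, vocab, idf, sim)
    scored = [(cosine(q, semantic_vector(tokens, vocab, idf, sim)), tags)
              for tokens, tags in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    result = []
    for score, tags in scored[:top_k]:
        if score > 0:
            for tag in tags:
                if tag not in result:
                    result.append(tag)
    return result

# Made-up example tweets (tokens, tags); not data from the paper.
CORPUS = [
    (["meeting", "friend", "today"], ["#friendship"]),
    (["pizza", "tonight"], ["#food"]),
]

print(recommend(["pal", "visit"], CORPUS, wu_palmer))    # semantic TF-IDF
print(recommend(["pal", "visit"], CORPUS, exact_match))  # plain TF-IDF
```

In this toy setup the query shares no literal word with either tweet, so the exact-match baseline has nothing to recommend, while the Wu & Palmer variant links "pal" to "friend" through their common parent and returns the first tweet's tag. Swapping `wu_palmer` for another word-similarity function is the only change needed to try a different measure.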