{"title":"使用大数据技术的Twitter数据分类","authors":"Madani Youness, Erritali Mohammed, Ben Jamâa","doi":"10.1145/3230348.3230368","DOIUrl":null,"url":null,"abstract":"Tweets classification or in general the classification of the social network's data is a recent field of scientific research, where researchers look for new methods to classify users data (tweets, Facebook's post...) into classes (positive, negative, neutral).This type of scientific research called sentiment analysis (SA) or opinion mining and it allows to extract the feelings, opinions or attitudes expressed in a tweet or a facebook post ... In this article, we describe how we can collect and store a large volume of data, which is in the form of tweets, in Hadoop Distributed File System (HDFS), and how we can classify these tweets using different classification methods, making a comparison between the well-known machine learning algorithms and a dictionary based-approach using the AFINN dictionary. The experimental results show that the AFINN dictionary outperforms the well-known machine learning algorithms.","PeriodicalId":188878,"journal":{"name":"Proceedings of the 2018 1st International Conference on Internet and e-Business","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Twitter Data Classification Using Big Data Technologies\",\"authors\":\"Madani Youness, Erritali Mohammed, Ben Jamâa\",\"doi\":\"10.1145/3230348.3230368\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Tweets classification or in general the classification of the social network's data is a recent field of scientific research, where researchers look for new methods to classify users data (tweets, Facebook's post...) into classes (positive, negative, neutral).This type of scientific research called sentiment analysis (SA) or opinion mining and it allows to extract the feelings, opinions or attitudes expressed in a tweet or a facebook post ... In this article, we describe how we can collect and store a large volume of data, which is in the form of tweets, in Hadoop Distributed File System (HDFS), and how we can classify these tweets using different classification methods, making a comparison between the well-known machine learning algorithms and a dictionary based-approach using the AFINN dictionary. The experimental results show that the AFINN dictionary outperforms the well-known machine learning algorithms.\",\"PeriodicalId\":188878,\"journal\":{\"name\":\"Proceedings of the 2018 1st International Conference on Internet and e-Business\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-04-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2018 1st International Conference on Internet and e-Business\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3230348.3230368\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 1st International Conference on Internet and e-Business","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3230348.3230368","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Twitter Data Classification Using Big Data Technologies
Tweets classification or in general the classification of the social network's data is a recent field of scientific research, where researchers look for new methods to classify users data (tweets, Facebook's post...) into classes (positive, negative, neutral).This type of scientific research called sentiment analysis (SA) or opinion mining and it allows to extract the feelings, opinions or attitudes expressed in a tweet or a facebook post ... In this article, we describe how we can collect and store a large volume of data, which is in the form of tweets, in Hadoop Distributed File System (HDFS), and how we can classify these tweets using different classification methods, making a comparison between the well-known machine learning algorithms and a dictionary based-approach using the AFINN dictionary. The experimental results show that the AFINN dictionary outperforms the well-known machine learning algorithms.