Classification and Summarization for Informative Tweets
Authors: S. Roy, Sumit Mishra, Rakesh Matam
Published in: 2020 IEEE International Students' Conference on Electrical, Electronics and Computer Science (SCEECS)
Publication date: 2020-02-01
DOI: 10.1109/SCEECS48394.2020.128 (https://doi.org/10.1109/SCEECS48394.2020.128)
Citation count: 10
Abstract
Microblogging websites such as Twitter and Facebook have become a substantial platform for people to publicize their feelings, requirements, and so on. They allow users to post short messages for their online audience. These messages are a fusion of blogging and instant messaging and may include images, videos, or voice notes. We focus primarily on the information that microblogging sites provide as a source of real-time data. People around the globe use these sites to describe what is happening in their everyday lives, so the sites give us access to unmanipulated data directly from users. In this paper, we consider a disaster dataset (the Fani Cyclone dataset), which consists of tweets related to a cyclone named "Fani". The tweets are pre-processed and then classified into two categories: informative and non-informative. We achieve a classification accuracy of 74.268% on the pre-processed data. Because the dataset concerns a disaster, we finally summarize the informative tweets for the concerned authorities to give them an overview of the data.
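The pre-process-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say which cleaning steps or which classifier the authors used, so everything below is an assumption: the `preprocess` rules (lowercasing, removing URLs, mentions, and punctuation), the choice of a multinomial Naive Bayes model with add-one smoothing, and the four toy tweets, which are invented for illustration and are not taken from the Fani Cyclone dataset.

```python
import math
import re
from collections import Counter


def preprocess(tweet: str) -> list[str]:
    """Lowercase a tweet and strip URLs, mentions, hashtag marks, and punctuation."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the mark
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and other symbols
    return text.split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for w in tokens:
                if w not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log(
                    (self.word_counts[c][w] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy training data, invented for this sketch.
train = [
    ("Cyclone Fani landfall near Puri, shelters open in schools", "informative"),
    ("Power outage reported in Bhubaneswar, relief camps open #CycloneFani", "informative"),
    ("Praying for everyone, stay strong @friend #Fani", "non-informative"),
    ("Cannot believe this weather, so scared right now", "non-informative"),
]

model = NaiveBayes().fit(
    [preprocess(text) for text, _ in train],
    [label for _, label in train],
)

print(model.predict(preprocess("Relief camps open near Puri")))
```

On this toy data, a tweet that names relief camps and a location is labeled informative, while one that only expresses emotion is labeled non-informative. A real experiment would need a labeled dataset and a held-out test split to report an accuracy figure comparable to the one in the abstract.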