Fire Emergency Detection from Twitter Using Supervised Principal
Authors: Mohammed Ahsan Raza Noori, Ritika Mehra
Published in: 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS)
Publication date: 2020-11-26
DOI: 10.1109/ICIIS51140.2020.9342671 (https://doi.org/10.1109/ICIIS51140.2020.9342671)
Citation count: 1
Abstract
Principal Component Analysis (PCA) is primarily a dimensionality reduction technique used in unsupervised machine learning; its use in supervised machine learning is still being developed. In supervised event detection from social media, PCA has not been well explored as a way to avoid the curse of dimensionality produced by the Vector Space Model (VSM). In this work, we propose a supervised event detection system that detects the occurrence of fire emergencies from Twitter streaming data in near real-time, using supervised PCA as the dimensionality reduction technique. Our aim is to find the minimum number of Principal Components (PCs) that achieves the highest classification performance. We used three machine learning algorithms for classification: Logistic Regression (LR), Support Vector Machine (SVM), and Decision Tree (DT). We compared the performance of these algorithms in conjunction with their corresponding numbers of PCs. Our experimental study shows that LR outperforms the other two algorithms and achieves the highest accuracy, 91%, using 710 PCs out of 1,000 dimensions. Based on these results, LR is the classifier used to build the actual system. To process high-dimensional data both in batch and in near real-time, we used the Apache Spark framework.
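The stated aim, finding the minimum number of PCs that reaches the highest classification performance, can be illustrated with a small selection routine. This is a minimal sketch under stated assumptions, not the authors' procedure: `smallest_best_k`, its `tolerance` parameter, and `toy_accuracy` are hypothetical names introduced here, and `toy_accuracy` is a synthetic stand-in for the real step of projecting the features onto k PCs, training a classifier, and scoring it on held-out data. Its values are not results from the paper.

```python
from typing import Callable, Iterable, Tuple


def smallest_best_k(
    candidates: Iterable[int],
    evaluate: Callable[[int], float],
    tolerance: float = 0.0,
) -> Tuple[int, float]:
    """Return (k, score) for the smallest k whose score is within
    `tolerance` of the best score seen across all candidates."""
    # Score every candidate number of components exactly once.
    scores = {k: evaluate(k) for k in candidates}
    if not scores:
        raise ValueError("candidates must not be empty")
    best = max(scores.values())
    # Among candidates that reach the best score (within tolerance),
    # keep the one that uses the fewest components.
    k = min(k for k, s in scores.items() if s >= best - tolerance)
    return k, scores[k]


def toy_accuracy(k: int) -> float:
    # Synthetic placeholder (assumption): accuracy rises with k and
    # plateaus at k = 30. Replace with a real train-and-evaluate step.
    return min(k, 30) / 40


if __name__ == "__main__":
    print(smallest_best_k(range(10, 60, 10), toy_accuracy))
```

In practice `evaluate` would wrap the PCA projection and the classifier of choice. A non-zero `tolerance` trades a small loss in score for fewer components.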