High speed streaming data analysis of web generated log streams
Authors: Sonali Agarwal, Bakshi Rohit Prasad
Venue: 2015 IEEE 10th International Conference on Industrial and Information Systems (ICIIS)
Publication date: 2015-12-01
DOI: https://doi.org/10.1109/ICIINFS.2015.7399047
Citations: 19
Abstract
Web logs provide useful insight into large-scale web-based applications and help in deriving web usage patterns. Because web usage data arrive at a high rate and in high volume, and are continuously updated in a real-time environment, they must be handled through modern big data architectures supported by powerful real-time big data processing tools. Web-generated log streams have the greatest impact when they can be analyzed at the time they are emitted. This work proposes a stream analytics framework designed specifically for web-generated log streams, built using a dataset of web access logs representing HTTP requests received by the NASA Kennedy Space Center server. The proposed framework addresses the challenges of managing multiple web-based log streams distributed across a fleet of web-based applications, and it presents a summarized statistical profile of those applications, which may be useful for web usage mining.
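The abstract does not describe the framework's internals, so the following is only a minimal single-process sketch of the general idea: consuming access-log lines one at a time and maintaining a running statistical profile. It assumes the logs are in Common Log Format, which is the format of the publicly distributed NASA Kennedy Space Center access logs. The names `LOG_PATTERN`, `StreamProfile`, and `profile_stream`, as well as the sample lines, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re
from collections import Counter

# Assumed input: Common Log Format
#   host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


class StreamProfile:
    """Running summary of a log stream, updated one line at a time."""

    def __init__(self):
        self.requests = 0
        self.total_bytes = 0
        self.malformed = 0
        self.status_counts = Counter()
        self.host_counts = Counter()

    def update(self, line):
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            # Count unparseable lines instead of stopping the stream.
            self.malformed += 1
            return
        self.requests += 1
        self.status_counts[match["status"]] += 1
        self.host_counts[match["host"]] += 1
        # A size of "-" means no body was returned; treat it as zero bytes.
        if match["size"] != "-":
            self.total_bytes += int(match["size"])


def profile_stream(lines):
    """Consume any iterable of log lines (file, socket, queue reader)."""
    profile = StreamProfile()
    for line in lines:
        profile.update(line)
    return profile


# Illustrative sample lines in the assumed format (not results from the paper).
sample = [
    '199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245',
    'unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985',
    '199.72.81.55 - - [01/Jul/1995:00:00:09 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'not a log line',
]
profile = profile_stream(sample)
print(profile.requests, profile.total_bytes, profile.status_counts.most_common())
```

Because `profile_stream` accepts any iterable, the same update logic could in principle sit behind a distributed stream processor, with one profile per application merged into a summarized view; that arrangement is an assumption here, not something the abstract specifies.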