Analyzing Click Stream Data Using Hadoop
Pulkit Sharma, Komal Mahajan, Vishal Bhatnagar
2016 Second International Conference on Computational Intelligence & Communication Technology (CICT), published 2016-08-18
DOI: https://doi.org/10.1109/CICT.2016.28
Citations: 2
Abstract
Big data is a voluminous and complex collection of data that is difficult to process with everyday database management techniques and traditional data processing tools. By analyzing such data, organizations can discover previously unknown trends and opportunities, which can be used to optimize services and support executive decision making. One such source is clickstream data, which online retailers and similar businesses can capture and analyze to optimize their websites and increase sales. Online retail has become a huge industry, with a large number of retailers, over a billion customers, and worldwide sales of over 22 trillion USD, a number that is set to increase in the coming years. Retailers therefore need to provide a great browsing experience to stay afloat in this crowded market. Hadoop is an easy-to-use framework that helps users process large data sets over a cluster of commodity computers using simple programming techniques.
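As a rough illustration of the "simple programming techniques" mentioned above, the following sketch simulates in a single process the map, shuffle, and reduce steps of the MapReduce model that Hadoop provides, counting page views per URL. The record layout (timestamp, user ID, URL, tab-separated), the sample data, and all function names are assumptions made for illustration and are not taken from the paper; on a real cluster the map and reduce steps would run as distributed tasks rather than as local function calls.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical clickstream records: timestamp, user ID, URL (tab-separated).
SAMPLE_LOG = [
    "2016-02-12T10:00:01\tu1\t/home",
    "2016-02-12T10:00:05\tu1\t/product/42",
    "2016-02-12T10:00:09\tu2\t/home",
    "malformed line",
    "2016-02-12T10:01:30\tu2\t/cart",
    "2016-02-12T10:02:00\tu3\t/home",
]


def map_clicks(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit (url, 1) for every well-formed record."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip records that do not match the assumed layout
        yield fields[2], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the values collected for each URL."""
    return {url: sum(values) for url, values in grouped.items()}


def count_page_views(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory log."""
    return reduce_counts(shuffle(map_clicks(lines)))


if __name__ == "__main__":
    for url, views in sorted(count_page_views(SAMPLE_LOG).items()):
        print(f"{url}\t{views}")
```

Keeping the map and reduce steps as separate functions that each handle one record or one key at a time mirrors how the work would be split across machines, which is what lets the same logic scale from this toy list to a large log.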