A review paper on data preprocessing: A critical phase in web usage mining process
S. Dwivedi, Bhupesh Rawat
2015 International Conference on Green Computing and Internet of Things (ICGCIoT), published 2015-10-08
DOI: https://doi.org/10.1109/ICGCIOT.2015.7380517
Citations: 28
Abstract
Web usage mining is the process of discovering user access patterns from a website's logs. A web log usually contains unstructured, noisy, and irrelevant data. To make this data suitable for pattern mining and pattern analysis, it must first pass through a data preprocessing phase. Data preprocessing not only improves the quality of the data but also reduces the size of the web log file. It involves several steps, including data collection, data cleaning, session identification, user identification, and path completion. This paper presents several data preprocessing techniques for preparing raw data for mining and analysis tasks.
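As a rough illustration of three of the steps named in the abstract (data cleaning, user identification, and session identification), the following Python sketch processes raw log lines. It is not the paper's method. Everything specific in it is an assumption made for the example: the input is taken to be in Combined Log Format, a user is approximated by the pair (IP address, user agent), a session ends after 30 minutes of inactivity, and the ignored file suffixes, the "bot" substring check, and all function names are hypothetical. Path completion is not covered.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Combined Log Format
# host ident user [time] "method url protocol" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed cleaning rules: embedded resources are treated as irrelevant.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# Assumed inactivity threshold that separates two sessions of one user.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Turn one raw log line into a dict, or return None if it is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


def is_relevant(record):
    """Data cleaning: keep successful GET page requests from non-robot agents."""
    url_path = record["url"].split("?", 1)[0].lower()
    return (
        record["method"] == "GET"
        and 200 <= record["status"] < 300
        and not url_path.endswith(IGNORED_SUFFIXES)
        and "bot" not in record["agent"].lower()  # crude robot heuristic
    )


def preprocess(lines):
    """Return {user_key: [session, ...]}, where a session is a list of records."""
    parsed = (parse_line(line) for line in lines)
    records = [r for r in parsed if r is not None and is_relevant(r)]
    records.sort(key=lambda r: r["time"])

    sessions = {}
    for record in records:
        # User identification: approximate a user by (IP address, user agent).
        user_key = (record["host"], record["agent"])
        user_sessions = sessions.setdefault(user_key, [])
        # Session identification: start a new session after a long gap.
        if user_sessions and (
            record["time"] - user_sessions[-1][-1]["time"] <= SESSION_TIMEOUT
        ):
            user_sessions[-1].append(record)
        else:
            user_sessions.append([record])
    return sessions
```

Calling `preprocess` on an iterable of log lines yields, for each approximated user, a time-ordered list of sessions. The cleaning step also shows why preprocessing shrinks the log: image, stylesheet, failed, and robot requests are dropped before any pattern mining takes place.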