Extracting usage patterns from web server log
J. Monisha, P. Jeba, M. Bhuvaneswari, K. Muneeswaran
2016 2nd International Conference on Green High Performance Computing (ICGHPC), published 2016-07-11
DOI: 10.1109/ICGHPC.2016.7508074 (https://doi.org/10.1109/ICGHPC.2016.7508074)
Citations: 7
Abstract
Websites are the primary medium through which organizations communicate with their customers. The navigational usability and accessibility of a website are crucial to gaining a competitive advantage. Understanding how customers use the website provides insight into their behavior. Web server logs contain latent information about customers' usage behavior. A user session is the sequence of pages a user accesses within a specific period. Sessions are reconstructed from the web server logs, and a simulated annealing technique is used to enhance session identification. Because browsing behavior is non-deterministic, soft clustering is used to assign each session a membership value for each cluster. A modified form of Fuzzy C-Means is used for clustering. The framework comprises access log preprocessing, user identification, session identification, and Mountain density function (MDF)-based fuzzy clustering. The resulting clusters represent common navigational behavior among users.
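As an illustration of two steps in the pipeline described above, the following is a minimal sketch and not the authors' implementation. It assumes log records have already been parsed and attributed to users, splits sessions with a fixed 30-minute inactivity timeout (a common baseline, used here in place of the simulated annealing refinement, whose details the abstract does not give), and computes memberships with the standard Fuzzy C-Means formula rather than the modified MDF-based variant. The names `RECORDS`, `build_sessions`, and `fcm_memberships`, and the sample data, are hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log records: (user_key, timestamp, page).
# user_key stands in for whatever the user identification step produces.
RECORDS = [
    ("10.0.0.1", "2016-01-01 10:00:00", "/index.html"),
    ("10.0.0.1", "2016-01-01 10:05:00", "/products.html"),
    ("10.0.0.2", "2016-01-01 10:06:00", "/index.html"),
    ("10.0.0.1", "2016-01-01 11:30:00", "/contact.html"),
]


def build_sessions(records, timeout=timedelta(minutes=30)):
    """Group each user's requests into sessions split by an inactivity gap.

    Assumption: a fixed timeout heuristic, not the paper's method.
    """
    by_user = defaultdict(list)
    for user, ts, page in records:
        by_user[user].append((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), page))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # chronological order per user
        current = [hits[0][1]]
        for (prev_ts, _), (ts, page) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Gap too long: close the running session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(page)
        sessions.append((user, current))
    return sessions


def fcm_memberships(point, centers, m=2.0):
    """Standard Fuzzy C-Means membership of one point in each cluster.

    Assumption: Euclidean distance and fuzzifier m > 1; the paper's
    modified, MDF-based variant may differ.
    """
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point coincides with one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


SESSIONS = build_sessions(RECORDS)
```

In this sketch a session would still need to be encoded as a numeric vector (for example, one binary entry per page) before being passed to `fcm_memberships`; that encoding is likewise an assumption, not something the abstract specifies.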