Title: Relational clustering based on a new robust estimator with application to Web mining
Authors: O. Nasraoui, R. Krishnapuram, A. Joshi
Venue: 18th International Conference of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.99TH8397)
Published: 1999-06-10
DOI: https://doi.org/10.1109/NAFIPS.1999.781785
Citations: 57
Abstract
Mining typical user profiles and URL associations from the vast amount of access logs is an important component of Web personalization. In this paper, we define the notion of a "user session" as a temporally compact sequence of Web accesses by a user. We also define a dissimilarity measure between two Web sessions that captures the organization of a Web site. To cluster the user sessions based on the pairwise dissimilarities, we introduce the relational fuzzy c-maximal density estimator (RFC-MDE) algorithm. RFC-MDE is robust and can deal with the outliers that are typical in this application. We show real examples of the use of RFC-MDE for extracting user profiles from log data, and compare its performance to that of the standard non-Euclidean fuzzy c-means.
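The abstract does not say how "temporally compact" is decided, so the following is only a minimal sketch of one common way to form sessions: it assumes each log entry has been reduced to a (user, timestamp in seconds, URL) tuple and that a session ends after a fixed idle gap. The function name `sessionize` and the 30-minute default `max_gap_seconds` are hypothetical choices for illustration, not taken from the paper.

```python
from collections import defaultdict


def sessionize(accesses, max_gap_seconds=1800):
    """Group (user, timestamp, url) accesses into per-user sessions.

    A new session starts whenever the gap between two consecutive accesses
    by the same user exceeds max_gap_seconds (an assumed threshold).
    Returns a list of (user, [urls]) pairs.
    """
    # Collect every access under the user who made it.
    by_user = defaultdict(list)
    for user, timestamp, url in accesses:
        by_user[user].append((timestamp, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order this user's accesses by time
        current = [hits[0][1]]
        # Walk consecutive pairs and cut the session at large idle gaps.
        for (prev_t, _), (t, url) in zip(hits, hits[1:]):
            if t - prev_t > max_gap_seconds:
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


if __name__ == "__main__":
    log = [
        ("a", 0, "/"),
        ("a", 60, "/courses"),
        ("a", 5000, "/faculty"),
        ("b", 10, "/"),
    ]
    for user, urls in sessionize(log):
        print(user, urls)
```

Sessions produced this way would then be compared pairwise with a dissimilarity measure and passed to a relational clustering algorithm; neither of those steps is sketched here because the abstract does not give their definitions.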