D. Koutsoukos, Georgios Alexandridis, Georgios Siolas, A. Stafylopatis
A new approach to session identification by applying fuzzy c-means clustering on web logs
2016 IEEE Symposium Series on Computational Intelligence (SSCI), published 2016-12-01
DOI: 10.1109/SSCI.2016.7849939 (https://doi.org/10.1109/SSCI.2016.7849939)
Citations: 7
Abstract
This paper outlines a new algorithm for session identification in web logs, based on fuzzy c-means clustering of the available data. The novelty of the proposed methodology lies in four elements: the initialization of the partition matrix using subtractive clustering; the examination of the effect that a variety of distance metrics (in addition to the widely used Euclidean distance) have on the clustering process; the determination of the number of user sessions from candidate sessions; and the representation of the session data. The experimental results show that the proposed methodology is effective in reconstructing user sessions and distinguishes individual sessions more accurately than the baseline time-heuristic methods proposed in the literature.
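
To make the clustering step concrete, the following is a minimal sketch of generic fuzzy c-means with a pluggable distance function. It is not the authors' implementation, and it rests on several assumptions: the partition matrix is initialized randomly rather than by subtractive clustering, the number of clusters `c` is passed in rather than derived from candidate sessions, and the feature vectors in the usage example are hypothetical (the abstract does not specify the session representation). The function names `fuzzy_c_means`, `euclidean`, and `manhattan` are illustrative only. Note also that the weighted-mean center update is exact only for Euclidean distance; with other metrics it acts as a heuristic.

```python
import math
import random


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance, one possible alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, dist=euclidean, iters=100, tol=1e-6, seed=0):
    """Generic fuzzy c-means; returns (centers, partition matrix).

    Assumption: random initialization stands in for the subtractive
    clustering initialization mentioned in the abstract.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random partition matrix: one row per point, each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        new_u = []
        for p in points:
            d = [dist(p, ctr) for ctr in centers]
            if any(v == 0.0 for v in d):
                # Point coincides with a center: give it full membership there.
                hits = [1.0 if v == 0.0 else 0.0 for v in d]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [v ** (-2.0 / (m - 1.0)) for v in d]
                total = sum(inv)
                new_u.append([v / total for v in inv])

        # Stop once no membership value changes by more than tol.
        shift = max(abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(c))
        u = new_u
        if shift < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Hypothetical features per request: (minutes since first hit, page depth).
    requests = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [60.0, 5.0], [61.0, 5.0], [62.0, 6.0]]
    centers, memberships = fuzzy_c_means(requests, c=2, dist=euclidean)
    for point, row in zip(requests, memberships):
        print(point, [round(v, 3) for v in row])
```

Passing `dist=manhattan` (or any other function of two vectors) swaps the metric without touching the rest of the loop, which is one simple way to compare distance metrics on the same data. Each row of the returned partition matrix gives a request's degree of membership in every candidate session, and taking the largest entry per row yields a hard session assignment.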