Cut-off time calculation for user session identification by reference length

J. Kapusta, M. Munk, Martin Drlík
Published in: 2012 6th International Conference on Application of Information and Communication Technologies (AICT)
Publication date: 2012-12-31
DOI: 10.1109/ICAICT.2012.6398500 (https://doi.org/10.1109/ICAICT.2012.6398500)
Citations: 21

Abstract

Discovering the behavior patterns of web site visitors is one of the methods of web log mining. Based on the discovered patterns, which are represented by sequence rules, an organization can modify and improve its web site. The data for the analysis are obtained from the web server log file. Because these data are anonymous, uniquely identifying a web site visitor is a problem. The paper deals with the less commonly used navigation-driven methods of user session identification. These methods assume that the user passes through several navigation pages during a visit until she/he finds the content page with the required information. A content page is a page on which the user spends considerably more time than on navigation pages, and it is considered to be the end of the session. Searching for the next content page using navigation pages constitutes a new user session. The division of pages into content and navigation pages is based on the calculation of a cut-off time C. Verifying that the variable representing the time a user spent on a particular page follows an exponential distribution is equally essential. We prepared an experiment with data obtained from the log file of a university web server. We tried to verify whether the time spent on web pages has an exponential distribution, and we estimated the value of the cut-off time. The results confirm our assumption that navigation-oriented methods can be used for proper user session identification.
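The abstract does not give the formula for C, so the following is only a minimal sketch under stated assumptions. It uses the form commonly cited for the reference length method, C = -ln(1 - gamma) / lambda, where gamma is an assumed share of navigation pages and lambda is estimated as the reciprocal of the mean observed time on a page. The function names, the parameter `nav_share`, and the sample times are hypothetical and are not taken from the paper. The sketch also ignores that the time spent on the last page of a visit cannot be read from a server log.

```python
import math
from typing import List, Sequence


def cutoff_time(times: Sequence[float], nav_share: float) -> float:
    """Cut-off time C under an assumed exponential model of time on a page.

    nav_share is the assumed proportion of navigation pages (0 < nav_share < 1).
    """
    if not times:
        raise ValueError("times must not be empty")
    if not 0.0 < nav_share < 1.0:
        raise ValueError("nav_share must lie strictly between 0 and 1")
    # Maximum-likelihood estimate of the exponential rate: lambda = 1 / mean.
    rate = len(times) / sum(times)
    # C is the nav_share quantile of the fitted exponential distribution.
    return -math.log(1.0 - nav_share) / rate


def split_sessions(times: Sequence[float], cutoff: float) -> List[List[int]]:
    """Group page indices into sessions.

    A page viewed longer than cutoff is treated as a content page
    and closes the current session.
    """
    sessions: List[List[int]] = []
    current: List[int] = []
    for index, seconds in enumerate(times):
        current.append(index)
        if seconds > cutoff:
            sessions.append(current)
            current = []
    if current:  # trailing navigation pages with no content page after them
        sessions.append(current)
    return sessions


# Hypothetical times (seconds) one visitor spent on seven consecutive pages.
page_times = [5, 8, 120, 4, 6, 90, 3]
c = cutoff_time(page_times, nav_share=0.5)
print(split_sessions(page_times, c))  # [[0, 1, 2], [3, 4, 5], [6]]
```

The cut-off here is simply a quantile of the fitted distribution, which is why the abstract stresses checking the exponential assumption first: if the observed times are not close to exponential, this value of C would not separate the assumed share of navigation pages.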