A novel datacube model supporting interactive web-log mining
Authors: Tadashi Ohmori, Yuichi Tsutatani, M. Hoshi
Published in: First International Symposium on Cyber Worlds, 2002. Proceedings.
Publication date: 2002-11-06
DOI: https://doi.org/10.1109/CW.2002.1180909
Citation count: 4
Abstract
Web-log mining is a technique for finding "useful" information in access-log data. Typically, association rule mining is used to find frequent patterns (or sequence patterns) of visited pages in access logs and to build user behavior models from those patterns. The difficulty with this approach is that a human decision-maker must repeat the mining process many times under different constraining conditions, different groups of pages, and different levels of abstraction. To support this process, this paper proposes a novel datacube model called the itemset cube. The cube manages frequent itemsets under various conditions, which are modeled as an n-dimensional space. An itemset cube is materialized, sliced, and rolled up repeatedly, in the same way that a traditional scalar datacube is for interactive scalar-value analysis. Although this looks simple, executing these operations quickly on an itemset cube is difficult, because different cells in an itemset cube contain different numbers of records, yet all cells must use the same threshold ratios in order to detect frequent itemsets of equal quality. The paper describes a datacube model for storing frequent itemsets, proposes an efficient algorithm for the associated operations, and demonstrates its application to a real-life dataset.
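The threshold-ratio difficulty described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it is a brute-force baseline under stated assumptions. The cell keys `(day, user_group)`, the page names, and the helpers `frequent_itemsets`, `materialize`, `slice_cube`, and `roll_up` are all hypothetical names chosen for illustration, and roll-up here simply re-mines the merged records, which is the slow path an efficient algorithm would try to avoid.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_ratio, max_size=2):
    """Brute-force count of itemsets up to max_size items.

    Keeps the itemsets whose count is at least min_ratio * (number of
    records in this cell), so the absolute cut-off differs per cell.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    min_count = min_ratio * len(transactions)
    return {s: c for s, c in counts.items() if c >= min_count}


def materialize(cells, min_ratio):
    """Mine every cell with the same threshold ratio."""
    return {key: frequent_itemsets(txns, min_ratio) for key, txns in cells.items()}


def slice_cube(cube, dim, value):
    """Keep only the cells whose coordinate on dimension `dim` equals `value`."""
    return {key: v for key, v in cube.items() if key[dim] == value}


def roll_up(cells, dim):
    """Merge the raw records of all cells that differ only on dimension `dim`."""
    merged = defaultdict(list)
    for key, txns in cells.items():
        merged[key[:dim] + key[dim + 1:]].extend(txns)
    return dict(merged)


# Hypothetical access-log sessions, keyed by (day, user_group).
cells = {
    ("mon", "guest"): [{"home", "news"}, {"home", "news"}, {"home"}, {"faq"}],
    ("tue", "guest"): [{"home", "faq"}, {"faq"}],
    ("mon", "member"): [{"shop"}, {"shop", "home"}],
}

cube = materialize(cells, min_ratio=0.5)
guest_only = slice_cube(cube, dim=1, value="guest")
rolled = materialize(roll_up(cells, dim=0), min_ratio=0.5)

print(cube[("mon", "guest")])
print(rolled[("guest",)])
```

In this toy data, `{"faq"}` is frequent in the rolled-up guest cell (3 of 6 records) even though it falls below the cut-off in the Monday guest cell (1 of 4 records), so its Monday count is not stored in the materialized cube. Summing the stored per-cell counts would therefore undercount it, which is why this sketch goes back to the raw records on roll-up.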