A novel datacube model supporting interactive web-log mining
Tadashi Ohmori, Yuichi Tsutatani, M. Hoshi
First International Symposium on Cyber Worlds, 2002. Proceedings. Published 2002-11-06.
DOI: https://doi.org/10.1109/CW.2002.1180909
Web-log mining is a technique for finding "useful" information in access-log data. Typically, association rule mining is used to find frequent patterns (or sequence patterns) of visited pages in access logs and to build user behavior models from those patterns. A difficulty with this approach is that a human decision-maker must repeat the mining process many times under different constraining conditions, different groups of pages, and different levels of abstraction. To support this process, this paper proposes a novel datacube model called the itemset cube. The cube manages frequent itemsets under various conditions, which are modeled as an n-dimensional space. An itemset cube is materialized, sliced, and rolled up repeatedly, in the same way that a traditional scalar datacube is used for interactive scalar-value analysis. Although this looks simple, executing these operations quickly on an itemset cube is difficult: different cells in an itemset cube contain different numbers of records, yet all cells must use the same threshold ratios in order to detect frequent itemsets of equal quality. The paper describes a datacube model for storing frequent itemsets and then proposes an efficient algorithm for the associated operations. Its application to a real-life dataset is also demonstrated.
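To make the difficulty with a shared threshold ratio concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: it counts itemsets by brute force and re-mines a rolled-up cell from its raw records. The function names (`frequent_itemsets`, `materialize`, `roll_up`), the two dimensions (month, user group), and the page paths are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_ratio, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those whose
    support ratio (count / number of records) is at least min_ratio."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_ratio * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


def materialize(cells, min_ratio):
    """Mine every cell with the same threshold ratio."""
    return {key: frequent_itemsets(txs, min_ratio) for key, txs in cells.items()}


def roll_up(cells, keep_dim):
    """Merge the records of all cells that share the value of one dimension."""
    merged = {}
    for key, txs in cells.items():
        merged.setdefault(key[keep_dim], []).extend(txs)
    return merged


# Hypothetical base cells keyed by (month, user group); each record is the
# set of pages visited in one session.
cells = {
    ("Jan", "student"): [{"/home", "/news"}, {"/home", "/news"}, {"/home"}, {"/faq"}],
    ("Jan", "staff"): [{"/home", "/faq"}, {"/faq"}],
}

cube = materialize(cells, min_ratio=0.5)
rolled = materialize(roll_up(cells, keep_dim=0), min_ratio=0.5)

for key, itemsets in {**cube, **rolled}.items():
    print(key, sorted((sorted(s), c) for s, c in itemsets.items()))
```

In this toy data, `/faq` is infrequent in the student cell (1 of 4 records) but frequent in the rolled-up January cell (3 of 6), while `{/home, /news}` is frequent in the student cell but not after the roll-up. The rolled-up result is therefore neither the union nor the intersection of the child results, and the stored per-cell itemsets alone do not carry the counts needed to decide it, which is one way to see why fast roll-up needs a dedicated algorithm.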