A Dynamic Approach for Frequent Pattern Mining Using Transposition of Database

2010 Second International Conference on Communication Software and Networks. Pub Date: 2010-02-26. DOI: 10.1109/ICCSN.2010.15

Sunil Joshi, R. Jain


Citations: 24

Abstract

Discovering patterns is an important problem in data mining across fields such as medicine, telecommunications, and the World Wide Web. Frequent pattern mining is a central research topic in association rule analysis, and Apriori is the classical algorithm for association rule mining. Many algorithms for mining association rules, and variants of them, have been proposed on the basis of Apriori. Most previous studies adopt Apriori-like algorithms that generate and test candidates, and they improve the algorithm's strategy and structure, but none concentrates on the structure of the database. A simple approach is to run the mining on a transposed database, which yields results much faster. Recently, several works have proposed mining patterns in transposed databases for the case where a database has thousands of attributes but only tens of objects. In this case, mining the transposed database explores a smaller search space. In this paper, we systematically explore the search space of frequent pattern mining and represent the database in transposed form. We develop an algorithm, termed DFPMT (A Dynamic Approach for Frequent Patterns Mining Using Transposition of Database), for mining frequent patterns; it is based on the Apriori algorithm and uses a dynamic approach such as Longest Common Subsequence (LCS). The main distinguishing features of the proposed scheme are that the database is stored in transposed form and that, in each iteration, the database is filtered and reduced by generating the LCS of the transaction IDs for each pattern. Our solution produces results faster. A quantitative exploration of these tradeoffs is conducted through an extensive experimental study on synthetic and real-life data sets.
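The abstract does not give the details of DFPMT, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes that the transposed database maps each item to a sorted list of transaction IDs, and that a candidate pattern's support is the length of the LCS of its parents' ID lists. The function names `transpose`, `lcs`, and `mine`, and the prefix-based candidate join, are illustrative choices.

```python
from itertools import combinations


def transpose(transactions):
    """Map each item to the sorted list of transaction IDs that contain it."""
    tidlists = {}
    for tid, items in enumerate(transactions, start=1):
        for item in items:
            tidlists.setdefault(item, []).append(tid)
    return tidlists


def lcs(a, b):
    """Longest common subsequence of two lists, by dynamic programming."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover one LCS.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def mine(transactions, min_support):
    """Level-wise (Apriori-style) mining over the transposed database.

    Assumption: patterns are tuples of sortable items, and min_support is an
    absolute count of transactions.
    """
    level = {
        (item,): tids
        for item, tids in transpose(transactions).items()
        if len(tids) >= min_support
    }
    frequent = {}
    while level:
        frequent.update(level)
        next_level = {}
        # Join two k-patterns that share their first k-1 items.
        for p, q in combinations(sorted(level), 2):
            if p[:-1] != q[:-1]:
                continue
            tids = lcs(level[p], level[q])
            # Infrequent patterns are dropped, so each level shrinks.
            if len(tids) >= min_support:
                next_level[p + (q[-1],)] = tids
        level = next_level
    return frequent
```

Because the transaction ID lists in this sketch are strictly increasing, the LCS of two lists is the same as their sorted intersection. A linear merge would give the same result more cheaply; the quadratic dynamic-programming version is used here only to mirror the LCS wording of the abstract.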

