Sampling Based N-Hash Algorithm for Searching Frequent Itemset

2010 International Conference on Internet Technology and Applications | Pub Date: 2010-09-09 | DOI: 10.1109/ITAPP.2010.5566076

Yong-ming Chen, Mei-ling Zhu

Citations: 3

Abstract

Searching for frequent itemsets is the critical problem in generating association rules in data mining. The classic hash-based technique for this search, put forward by J. S. Park, has two shortcomings: it is difficult to choose an appropriate hash function, and it is prone to hash collisions. To solve these two problems, Chen Y.M. proposed the N-Hash algorithm, which does not require choosing a hash function and avoids hash collisions. In this paper, sampling is employed to improve the efficiency of the N-Hash algorithm.
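The abstract does not give the details of N-Hash, so the following is only a rough Python sketch of the two ideas it mentions, not the authors' algorithm. `bucket_counts` imitates a hash-based counting pass over 2-itemsets with a hypothetical hash function, which shows how a collision can inflate a bucket. `frequent_pairs_from_sample` counts pairs exactly on a random sample of the transactions. All function names, the hash formula, and the parameters are assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def bucket_counts(transactions, num_buckets):
    """Hash-based pass: add every 2-itemset to one of num_buckets counters."""
    buckets = [0] * num_buckets
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            # Hypothetical hash function (an assumption, not from the paper).
            # Distinct pairs may land in the same bucket, i.e. collide.
            buckets[(a * 10 + b) % num_buckets] += 1
    return buckets


def exact_pair_counts(transactions):
    """Count each 2-itemset under its own key, so no counts are merged."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(set(t)), 2))
    return counts


def frequent_pairs_from_sample(transactions, min_support, sample_fraction, seed=0):
    """Estimate frequent 2-itemsets from a random sample of the transactions.

    min_support is a fraction of transactions (an assumed convention).
    """
    rng = random.Random(seed)
    k = max(1, int(len(transactions) * sample_fraction))
    sample = rng.sample(transactions, k)
    threshold = min_support * k  # support threshold scaled to the sample size
    counts = exact_pair_counts(sample)
    return {pair for pair, c in counts.items() if c >= threshold}
```

With the transactions `[[1, 2, 3], [1, 2], [1, 2, 4], [3, 4], [1, 2, 3]]` and 7 buckets, the pairs (1, 3) and (3, 4) share a bucket under this hash, so the bucket reports 3 although neither pair occurs 3 times. A result computed from a sample is an estimate: it can miss itemsets or include extra ones, so it would normally be verified against the full data.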



