An Efficient Method for Frequent Itemset Mining on Temporal Data

K FathimaSherinT, B. A. Kumar
International Journal of Scientific Research in Computer Science, Engineering and Information Technology. Published 2019-06-20. DOI: 10.32628/CSEIT1953162 (https://doi.org/10.32628/CSEIT1953162)

Abstract

Frequent itemset mining (FIM) is a data mining task that extracts frequent itemsets from a database. Existing methods for finding frequent itemsets assume that datasets are static and that the induced rules are relevant across the entire dataset. This is not the case when the data is temporal, because time-related information changes the mining results. Patterns may hold during all time intervals or only during specific ones. To restrict the time intervals, frequent itemset mining with time cubes is proposed, which handles time hierarchies in the mining process. In this way, patterns that occur periodically, during a time interval, or both are recognized. This paper therefore focuses on developing an efficient algorithm that mines frequent itemsets and their associated time intervals from a transactional database by extending the Apriori algorithm, with support and density as thresholds. Density is proposed to address the problem of overestimated time periods and to ensure the validity of the discovered patterns. As an extension of the existing framework, the density rate and minimum threshold, which were previously user-specified parameters, are generated dynamically. In addition, computation time is compared between mining with and without partitioning the dataset, and the comparison shows that partitioning reduces computation time.
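
The idea of combining a support threshold with a density threshold per time cube can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `mine_temporal_itemsets`, the `cube_of` mapping, and the density rule (a cube counts as dense when it holds at least `density_rate` times the average number of transactions per cube) are assumptions made for illustration, and brute-force subset counting stands in for Apriori candidate generation.

```python
from collections import defaultdict
from itertools import combinations


def mine_temporal_itemsets(transactions, cube_of, min_support, density_rate, max_size=2):
    """Return {cube: {itemset: support}} for dense cubes only.

    transactions: list of (timestamp, items) pairs.
    cube_of: hypothetical function mapping a timestamp to a time-cube key.
    """
    # Group transactions into basic time cubes.
    cubes = defaultdict(list)
    for timestamp, items in transactions:
        cubes[cube_of(timestamp)].append(frozenset(items))
    if not cubes:
        return {}

    # Assumed density rule: derive the threshold from the data instead of
    # asking the user for an absolute transaction count.
    average = sum(len(txns) for txns in cubes.values()) / len(cubes)
    min_density = density_rate * average

    result = {}
    for cube, txns in cubes.items():
        if len(txns) < min_density:
            continue  # sparse cube: patterns found here would be unreliable
        # Brute-force counting of all itemsets up to max_size (not Apriori).
        counts = defaultdict(int)
        for items in txns:
            for size in range(1, max_size + 1):
                for itemset in combinations(sorted(items), size):
                    counts[itemset] += 1
        frequent = {
            itemset: count / len(txns)
            for itemset, count in counts.items()
            if count / len(txns) >= min_support
        }
        if frequent:
            result[cube] = frequent
    return result


# Usage: hours of the day are bucketed into two hypothetical time cubes.
transactions = [
    (9, ["bread", "milk"]),
    (9, ["bread", "milk", "eggs"]),
    (9, ["bread"]),
    (14, ["tea"]),
]
found = mine_temporal_itemsets(
    transactions,
    cube_of=lambda hour: "morning" if hour < 12 else "afternoon",
    min_support=0.6,
    density_rate=0.75,
)
print(found)
```

In this toy data the afternoon cube holds a single transaction, so the assumed density rule discards it, and only itemsets from the morning cube are reported.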