Parallel and Distributed Association Rule Mining Algorithms: A Recent Survey

Sudarsan Biswas, Neepa Biswas, K. Mondal
Journal: Information Management and Computer Science
Publication date: 2019-09-05
Publication type: Journal Article
DOI: 10.26480/imcs.01.2019.15.24 (https://doi.org/10.26480/imcs.01.2019.15.24)
Citations: 1

Abstract

Data analysis has become essential as rapidly advancing electronic technology generates large volumes of transactional data logs from a wide range of source devices. Parallel and distributed computing is a useful approach for enhancing the data mining process. This paper presents a systematic review of parallel association rule mining (PARM) and distributed association rule mining (DARM) approaches. We observe that parallelized approaches based on Apriori, equivalence classes, Hadoop (MapReduce), and Spark prove to be very efficient in PARM and DARM environments. We conclude that this review, together with the references it cites, conveys the foremost hypothetical issues and offers researchers a guideline toward interesting research directions. The most important issues and challenges include the large size of databases, the dimensionality of the data, the indexing schemes used for data in the database, data skewness, database location, load-balancing strategies, methods for adapting to incremental databases, and the orientation of the database.