Authors: Mohammad Javad Shayegan Fard, Parsa Asgari Namin
Published in: 2020 6th International Conference on Web Research (ICWR), 2020-04-01
DOI: 10.1109/ICWR49608.2020.9122295 (https://doi.org/10.1109/ICWR49608.2020.9122295)
Review of Apriori based Frequent Itemset Mining Solutions on Big Data
The data generated today is massive in volume, velocity, and variety, and deriving knowledge from data under these conditions is a great challenge. Researchers have therefore proposed ways to deal with this challenge. Frequent itemset mining is one of them: it identifies itemsets within vast amounts of data to support a variety of pursuits and businesses, as part of a process termed 'Association Rule Mining'. A large body of work exists in this area, and the algorithms, frameworks, and applications introduced over the past decade have produced many interesting approaches. One of these algorithms is Apriori, which is simple yet powerful. However, the original Apriori is not suitable for big data, and for this reason researchers have attempted to introduce ways and schemes to adapt it to this new age of data. Given the number of efforts in this area, a bird's-eye view of past work is of value. This review aims to offer insight into the work done at the intersection of two topics: big data and the Apriori algorithm. It covers Apriori-based algorithms presented in the past decade, with a focus on three popular big data platforms: Apache Hadoop, Spark, and Flink. The major points of each approach and solution are also presented. A conclusion at the end summarizes the points discussed in this paper.
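For readers unfamiliar with the algorithm the review centers on, the following is a minimal single-machine sketch of classic Apriori. It is an illustration written for this summary, not code from the paper or from any of the reviewed Hadoop, Spark, or Flink solutions. The function name `apriori`, the parameter `min_support` (an absolute transaction count), and the sample data are assumptions. The sketch makes one full pass over the data per itemset size, which is one commonly cited reason the original algorithm scales poorly.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of iterables of hashable items.
    min_support: minimum number of transactions that must contain an itemset
                 (an absolute count; this convention is an assumption).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: a candidate survives only if all of its (k-1)-subsets
        # are frequent (the Apriori property).
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan over all transactions for this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    # Hypothetical market-basket data, for illustration only.
    baskets = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    for itemset, count in apriori(baskets, min_support=2).items():
        print(sorted(itemset), count)
```

The count step is the part that distributed adaptations typically parallelize, since each transaction can be checked against the candidates independently.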