Analysis of Big-Data Based Data Mining Engine
Xinxin Huang, Shu Gong
2017 13th International Conference on Computational Intelligence and Security (CIS), 2017-12-01
DOI: https://doi.org/10.1109/CIS.2017.00043
Citations: 2
Abstract
To address the problem of data mining on big data, this paper studies a data mining engine built for big data. Using Spark as the engine core and programming model, several parallel data mining algorithms are designed and implemented, and an efficient data mining engine system is built. As a result, traditional data mining algorithms can run in parallel in a cluster environment, so that big data can be put to better use. Together, this work realizes a complete big data mining system, which provides an efficient and easy-to-use tool for implementing data mining algorithms on big data sets.
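
The abstract does not name the algorithms that were parallelized or show any engine code, so the following is only a minimal sketch of the general idea: a traditional algorithm (k-means, chosen here purely as an illustration) restructured into per-partition "map" work and a "reduce" merge, which is the shape a Spark-style programming model encourages. It is plain Python, not the Spark API; all function names (`partition`, `map_partition`, `merge`, `kmeans`) are hypothetical, and the partitions are processed sequentially where a cluster would process them in parallel.

```python
from functools import reduce


def partition(data, n_parts):
    """Split data into n_parts chunks (a stand-in for cluster partitions)."""
    return [data[i::n_parts] for i in range(n_parts)]


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_partition(chunk, centroids):
    """Per-partition work: accumulate (coordinate sums, count) per cluster."""
    acc = {}
    for point in chunk:
        k = nearest(point, centroids)
        sums, count = acc.get(k, ([0.0] * len(point), 0))
        acc[k] = ([s + p for s, p in zip(sums, point)], count + 1)
    return acc


def merge(a, b):
    """Reduce step: combine two partial accumulators into one."""
    out = dict(a)
    for k, (sums, count) in b.items():
        if k in out:
            prev_sums, prev_count = out[k]
            out[k] = ([x + y for x, y in zip(prev_sums, sums)], prev_count + count)
        else:
            out[k] = (sums, count)
    return out


def kmeans_step(parts, centroids):
    """One k-means iteration expressed as map-over-partitions then reduce."""
    # On a cluster, each partition would be handled by a different worker.
    partials = [map_partition(chunk, centroids) for chunk in parts]
    totals = reduce(merge, partials, {})
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / totals[k][1] for s in totals[k][0]] if k in totals else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans(data, centroids, n_parts=4, iterations=10):
    """Run a fixed number of iterations over a fixed partitioning of data."""
    parts = partition(data, n_parts)
    for _ in range(iterations):
        centroids = kmeans_step(parts, centroids)
    return centroids
```

Because each partition only emits small per-cluster sums and counts, the merge step moves very little data between workers, which is one reason this decomposition suits a cluster setting. Whether the paper's engine uses this exact decomposition is not stated in the abstract.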