多核共享内存系统频繁模式挖掘的一种新的并行方法

International Symposium on Design and Implementation of Symbolic Computation Systems Pub Date : 2013-11-18 DOI:10.1145/2534645.2534653

Lan Vu, G. Alaghband

{"title":"多核共享内存系统频繁模式挖掘的一种新的并行方法","authors":"Lan Vu, G. Alaghband","doi":"10.1145/2534645.2534653","DOIUrl":null,"url":null,"abstract":"Frequent pattern mining is an important problem in data mining with many practical applications. Current parallel methods for mining frequent patterns unstably perform for different database types and under-utilize the benefits of multi-core shared memory machines. We present ShaFEM, a novel parallel frequent pattern mining method, to address these issues. Our method can dynamically adapt to the data characteristics to efficiently perform on both sparse and dense databases. Its parallel mining lock free approach minimizes the synchronization needs and maximizes the data independence to enhance the scalability. Its structure lends itself well for dynamic job scheduling resulting in well-balanced load on new multi-core shared memory architectures. We evaluate ShaFEM on a 12-core multi-socket server and find that our method runs 2.1--5.8 times faster than the state-of-the-art parallel method. For some test cases, we have shown that ShaFEM saves 4.9 days and 12.8 hours of execution time over the compared method.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"160 49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"Novel parallel method for mining frequent patterns on multi-core shared memory systems\",\"authors\":\"Lan Vu, G. Alaghband\",\"doi\":\"10.1145/2534645.2534653\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Frequent pattern mining is an important problem in data mining with many practical applications. Current parallel methods for mining frequent patterns unstably perform for different database types and under-utilize the benefits of multi-core shared memory machines. We present ShaFEM, a novel parallel frequent pattern mining method, to address these issues. Our method can dynamically adapt to the data characteristics to efficiently perform on both sparse and dense databases. Its parallel mining lock free approach minimizes the synchronization needs and maximizes the data independence to enhance the scalability. Its structure lends itself well for dynamic job scheduling resulting in well-balanced load on new multi-core shared memory architectures. We evaluate ShaFEM on a 12-core multi-socket server and find that our method runs 2.1--5.8 times faster than the state-of-the-art parallel method. For some test cases, we have shown that ShaFEM saves 4.9 days and 12.8 hours of execution time over the compared method.\",\"PeriodicalId\":166804,\"journal\":{\"name\":\"International Symposium on Design and Implementation of Symbolic Computation Systems\",\"volume\":\"160 49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Symposium on Design and Implementation of Symbolic Computation Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2534645.2534653\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Design and Implementation of Symbolic Computation Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2534645.2534653","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 16

摘要

频繁模式挖掘是数据挖掘中的一个重要问题，有许多实际应用。当前用于挖掘频繁模式的并行方法在不同数据库类型下的性能不稳定，并且没有充分利用多核共享内存机器的优势。为了解决这些问题，我们提出了一种新的并行频繁模式挖掘方法ShaFEM。该方法可以动态适应数据特征，在稀疏数据库和密集数据库上都能有效地执行。它的并行挖掘无锁方法最大限度地减少了同步需求，最大限度地提高了数据独立性，增强了可扩展性。它的结构非常适合动态作业调度，从而在新的多核共享内存架构上实现良好的负载平衡。我们在12核多套接字服务器上评估ShaFEM，发现我们的方法比最先进的并行方法快2.1- 5.8倍。对于一些测试用例，我们已经证明ShaFEM比比较的方法节省了4.9天和12.8小时的执行时间。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Novel parallel method for mining frequent patterns on multi-core shared memory systems

Frequent pattern mining is an important problem in data mining with many practical applications. Current parallel methods for mining frequent patterns unstably perform for different database types and under-utilize the benefits of multi-core shared memory machines. We present ShaFEM, a novel parallel frequent pattern mining method, to address these issues. Our method can dynamically adapt to the data characteristics to efficiently perform on both sparse and dense databases. Its parallel mining lock free approach minimizes the synchronization needs and maximizes the data independence to enhance the scalability. Its structure lends itself well for dynamic job scheduling resulting in well-balanced load on new multi-core shared memory architectures. We evaluate ShaFEM on a 12-core multi-socket server and find that our method runs 2.1--5.8 times faster than the state-of-the-art parallel method. For some test cases, we have shown that ShaFEM saves 4.9 days and 12.8 hours of execution time over the compared method.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Symposium on Design and Implementation of Symbolic Computation Systems

自引率

0.00%

发文量