HPSVM: Heterogeneous Parallel SVM with Factorization Based IPM Algorithm on CPU-GPU Cluster

Tao Li, Xuechen Liu, Qiankun Dong, Wenjing Ma, Kai Wang

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), February 2016
DOI: 10.1109/PDP.2016.29
Citations: 9
Abstract
The support vector machine (SVM) is a supervised method widely used in statistical classification and regression analysis. SVM training can be solved via the interior point method (IPM), which offers low storage cost, fast convergence, and easy parallelization. However, IPM training still faces challenges in speed and memory use. In this paper, we propose HPSVM, a parallel primal-dual IPM algorithm based on incomplete Cholesky factorization (ICF) for efficiently training large-scale SVMs on a CPU-GPU cluster. Our approach is distinguished from earlier work in that it is specifically designed to take maximal advantage of CPU-GPU collaborative computation through a dual-buffer, three-stage pipeline mechanism, and it efficiently handles large-scale training datasets. In HPSVM, the heterogeneous hierarchical memory is fully exploited to alleviate the data-transfer bottleneck, and a programming paradigm is presented to build an efficient collaboration mechanism between CPU and GPU. Comprehensive experiments show that HPSVM is up to 11 times faster than the CPU version on real datasets.
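The ICF step the abstract refers to replaces the dense n x n kernel matrix with a low-rank factor so the IPM normal equations stay cheap. As a minimal illustration (not the paper's implementation), the following NumPy sketch performs a pivoted incomplete Cholesky factorization, approximating a positive semi-definite kernel matrix K by G G^T with G of rank at most p; the function name `icf` and the tolerance are our own choices:

```python
import numpy as np

def icf(K, p, tol=1e-10):
    """Pivoted incomplete Cholesky: approximate PSD K (n x n) by G @ G.T,
    where G is n x p. Stops early if the residual diagonal vanishes."""
    n = K.shape[0]
    G = np.zeros((n, p))
    d = np.diag(K).astype(float).copy()   # residual diagonal of K - G @ G.T
    for j in range(p):
        i = int(np.argmax(d))             # pivot: largest residual diagonal entry
        if d[i] < tol:                    # matrix numerically exhausted
            return G[:, :j]
        # residual of pivot column, then scale so G[i, j] = sqrt(d[i])
        col = K[:, i] - G[:, :j] @ G[i, :j]
        G[:, j] = col / np.sqrt(d[i])
        d -= G[:, j] ** 2                 # update residual diagonal
    return G
```

For a kernel matrix of numerical rank r, the factorization terminates after r steps and reproduces K exactly; for full-rank kernels, p controls the storage/accuracy trade-off that makes the IPM tractable at scale.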
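The dual-buffer, three-stage pipeline overlaps data transfer with computation so neither the CPU nor the GPU sits idle. The paper's mechanism runs on CUDA streams and device buffers; as a host-side analogy only, the sketch below wires three stages (load, compute, store) through bounded queues of depth two, which plays the role of the dual buffers. The function name `pipeline` and the stage callbacks are hypothetical:

```python
import threading
import queue

def pipeline(chunks, load, compute):
    """Three-stage pipeline: load -> compute -> collect results.
    Queues of maxsize=2 emulate double buffering: a stage can fill one
    buffer while its downstream neighbor drains the other."""
    q1 = queue.Queue(maxsize=2)   # load -> compute
    q2 = queue.Queue(maxsize=2)   # compute -> store
    results = []

    def stage_load():
        for c in chunks:
            q1.put(load(c))
        q1.put(None)              # sentinel: no more work

    def stage_compute():
        while (x := q1.get()) is not None:
            q2.put(compute(x))
        q2.put(None)

    def stage_store():
        while (y := q2.get()) is not None:
            results.append(y)

    threads = [threading.Thread(target=f)
               for f in (stage_load, stage_compute, stage_store)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each stage is a single worker draining a FIFO queue, chunk order is preserved while adjacent stages run concurrently, which is the property the abstract's pipeline exploits to hide CPU-GPU transfer latency.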