GPGPU-3 Pub Date : 2010-03-14 DOI: 10.1145/1735688.1735707

Malak Alshawabkeh, B. Jang, D. Kaeli

Abstract: The Local Outlier Factor (LOF) is a very powerful anomaly detection method available in machine learning and classification. The algorithm defines the notion of local outlier in which the degree to which an object is outlying is dependent on the density of its local neighborhood, and each object can be assigned an LOF which represents the likelihood of that object being an outlier. Although this concept of a local outlier is a useful one, the computation of LOF values for every data object requires a large number of k-nearest neighbor queries -- this overhead can limit the use of LOF due to the computational overhead involved.

Due to the growing popularity of Graphics Processing Units (GPU) in general-purpose computing domains, and equipped with a high-level programming language designed specifically for general-purpose applications (e.g., CUDA), we look to apply this parallel computing approach to accelerate LOF. In this paper we explore how to utilize a CUDA-based GPU implementation of the k-nearest neighbor algorithm to accelerate LOF classification. We achieve more than a 100X speedup over a multi-threaded dual-core CPU implementation. We also consider the impact of input data set size, the neighborhood size (i.e., the value of k) and the feature space dimension, and report on their impact on execution time.
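To make the density idea in the abstract concrete, here is a minimal CPU-only sketch of LOF in Python. It is not the authors' CUDA implementation; the function names `k_nearest` and `lof_scores` are hypothetical, the neighbor search is brute force, and ties at the k-th distance are ignored (each point gets exactly k neighbors), which simplifies the usual LOF definition. It also assumes the data contains no duplicate points.

```python
import math


def k_nearest(points, i, k):
    """Return the k nearest (distance, index) pairs for points[i] by brute force."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Compute a Local Outlier Factor for every point (simplified: no tie handling)."""
    n = len(points)
    # One k-nearest-neighbor query per point: this is the step the paper
    # offloads to the GPU.
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance of each point: distance to its k-th nearest neighbor.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k_dist[j], dist(i, j)).
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], d) for d, j in neighbours[i])
        lrd.append(len(neighbours[i]) / reach)

    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [
        sum(lrd[j] / lrd[i] for _, j in neighbours[i]) / len(neighbours[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 3))
```

Scores near 1 indicate a point whose neighborhood is about as dense as its neighbors' neighborhoods; scores well above 1 indicate a likely outlier. The nested distance computation in `k_nearest` is quadratic in the number of points, which is why the k-nearest-neighbor queries dominate the cost.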

Citations: 56

Implementing the PGI Accelerator model

GPGPU-3 Pub Date : 2010-03-14 DOI: 10.1145/1735688.1735697

M. Wolfe

Citations: 186

Accelerating SQL database operations on a GPU with CUDA

GPGPU-3 Pub Date : 2010-03-14 DOI: 10.1145/1735688.1735706

P. Bakkum, K. Skadron

Citations: 284

A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction

GPGPU-3 Pub Date : 2010-03-14 DOI: 10.1145/1735688.1735698

Allen Leung, Nicolas Vasilache, Benoît Meister, M. Baskaran, David Wohlford, C. Bastoul, R. Lethin

Abstract: Programmers for GPGPU face a rapidly changing substrate of programming abstractions, execution models, and hardware implementations. It has been established, through numerous demonstrations for particular conjunctions of application kernel, programming languages, and GPU hardware instance, that it is possible to achieve significant improvements in the price/performance and energy/performance over general purpose processors. But these demonstrations are each the result of significant dedicated programmer labor, which is likely to be duplicated for each new GPU hardware architecture to achieve performance portability.

This paper discusses the implementation, in the R-Stream compiler, of a source to source mapping pathway from a high-level, textbook-style algorithm expression method in ANSI C, to multi-GPGPU accelerated computers. The compiler performs hierarchical decomposition and parallelization of the algorithm between and across host, multiple GPGPUs, and within-GPU. The semantic transformations are expressed within the polyhedral model, including optimization of integrated parallelization, locality, and contiguity tradeoffs. Hierarchical tiling is performed. Communication and synchronization operations at multiple levels are generated automatically. The resulting mapping is currently emitted in the CUDA programming language.

The GPU backend adds to the range of hardware and accelerator targets for R-Stream and indicates the potential for performance portability of single sources across multiple hardware targets.
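As a rough illustration of the hierarchical tiling mentioned in the abstract, the following Python sketch splits a matrix multiplication into outer tiles and, within each, inner tiles. It is not R-Stream output (the compiler emits CUDA) and does not use the polyhedral model; the function name `tiled_matmul` and the `outer` and `inner` tile sizes are hypothetical. The comments mapping loop levels to devices and thread blocks are only an analogy.

```python
def tiled_matmul(a, b, outer=4, inner=2):
    """Multiply matrices a (n x m) and b (m x p) using two levels of tiling."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer tiles: by analogy, the chunk of output assigned to one device.
    for io in range(0, n, outer):
        for jo in range(0, p, outer):
            # Inner tiles: by analogy, the chunk handled by one thread block.
            for ii in range(io, min(io + outer, n), inner):
                for ji in range(jo, min(jo + outer, p), inner):
                    # Point loops: individual output elements within the tile.
                    for i in range(ii, min(ii + inner, io + outer, n)):
                        for j in range(ji, min(ji + inner, jo + outer, p)):
                            c[i][j] = sum(a[i][t] * b[t][j] for t in range(m))
    return c


if __name__ == "__main__":
    a = [[float(i + j) for j in range(3)] for i in range(5)]
    b = [[float(i * j + 1) for j in range(7)] for i in range(3)]
    print(tiled_matmul(a, b)[0])
```

Every output element is computed exactly once, so the result matches an untiled multiplication; only the iteration order changes. In a real mapping, the tile sizes would be chosen to fit each level of the memory hierarchy.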

Citations: 87

Latest GPGPU-3 papers