Selecting Multiple Order Statistics with a Graphics Processing Unit
Jeffrey D. Blanchard, Erik Opavsky, Emircan Uysaler
ACM Transactions on Parallel Computing, pp. 10:1-10:23, published 2016-08-08
DOI: https://doi.org/10.1145/2948974
Abstract
Extracting a set of multiple order statistics from a huge data set provides important information about the distribution of the values in the full set of data. This article introduces an algorithm, bucketMultiSelect, for simultaneously selecting multiple order statistics with a graphics processing unit (GPU). Typically, when a large set of order statistics is desired, the vector is sorted. When the sorted version of the vector is not needed, bucketMultiSelect significantly reduces computation time by eliminating a large portion of the unnecessary operations involved in sorting. For large vectors, bucketMultiSelect returns thousands of order statistics in less time than sorting the vector while typically using less memory. For vectors containing 2^28 values of type double, bucketMultiSelect selects the 101 percentile order statistics in less than 95 ms and is more than 8× faster than sorting the vector with a GPU-optimized merge sort.
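The abstract does not spell out how bucketMultiSelect avoids the full sort, so the following is only a minimal CPU sketch of the general bucket idea, not the authors' GPU implementation. It assumes equal-width buckets between the minimum and maximum value, which is an illustrative choice; the function name `bucket_multi_select` and the parameter `num_buckets` are hypothetical. Elements are counted per bucket, and only the buckets that contain a requested rank are collected and sorted.

```python
from bisect import bisect_right


def bucket_multi_select(values, ranks, num_buckets=1024):
    """Return the order statistics of `values` at the given 0-based ranks.

    Illustrative sketch only: equal-width buckets are an assumption,
    not the bucketing scheme of the published algorithm.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    for k in ranks:
        if not 0 <= k < n:
            raise IndexError(f"rank {k} is out of range for {n} values")

    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal, so every order statistic is that value.
        return [lo for _ in ranks]
    scale = num_buckets / (hi - lo)

    def bucket_of(x):
        # Monotone map from value to bucket index; the maximum is clamped
        # into the last bucket.
        return min(int((x - lo) * scale), num_buckets - 1)

    # Pass 1: count how many elements fall into each bucket.
    counts = [0] * num_buckets
    for x in values:
        counts[bucket_of(x)] += 1

    # Exclusive prefix sum: starts[b] is the rank of the smallest
    # element of bucket b in the (never materialized) sorted vector.
    starts = [0] * num_buckets
    total = 0
    for b, c in enumerate(counts):
        starts[b] = total
        total += c

    # The bucket holding rank k is the last bucket whose start is <= k.
    rank_bucket = {k: bisect_right(starts, k) - 1 for k in ranks}
    needed = {b: [] for b in rank_bucket.values()}

    # Pass 2: collect only the elements of buckets that hold a requested rank.
    for x in values:
        b = bucket_of(x)
        if b in needed:
            needed[b].append(x)

    # Sort just those buckets and read each order statistic at its offset.
    for items in needed.values():
        items.sort()
    return [needed[rank_bucket[k]][k - starts[rank_bucket[k]]] for k in ranks]


if __name__ == "__main__":
    data = [((i * 7919) % 10007) / 10007 for i in range(5000)]
    percentile_ranks = [round(p * (len(data) - 1) / 100) for p in range(101)]
    print(bucket_multi_select(data, percentile_ranks)[:5])
```

In this sketch the work saved relative to a full sort comes from skipping every bucket that holds no requested rank. With heavily skewed data, equal-width buckets can leave most elements in a few buckets, in which case a different bucketing rule would be needed to keep that saving.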