High Level Programming of Document Classification Systems for Heterogeneous Environments using OpenCL (Abstract Only)

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Pub Date: 2015-02-22. DOI: 10.1145/2684746.2689136

Nasibeh Nasiri, Oren Segal, M. Margala, W. Vanderbauwhede, S. R. Chalamalasetti


Citations: 1

Abstract

Document classification is at the heart of several of the applications that have been driving the proliferation of the internet in our daily lives. The ever-growing amounts of data and the need for higher-throughput, more energy-efficient document classification solutions motivated us to investigate alternatives to the traditional homogeneous CPU-based implementations. We investigate a heterogeneous system where CPUs are combined with FPGAs as system accelerators. Incorporating FPGAs as accelerators in a heterogeneous computing environment allows for the creation of flexible custom hardware solutions that can potentially offer increased power efficiency and performance gains. One of the main issues delaying widespread adoption of FPGAs as standard heterogeneous system accelerators is the difficulty of programming them. The OpenCL standard offers a unified C programming model for any device that adheres to its standards. We investigate an Altera OpenCL FPGA-based implementation of a document classification system in which a stream of HTML documents is scored according to a profile on a document-by-document basis. The results show that the throughput of the document classification application with and without Bloom filters is 312 MB/s and 343 MB/s respectively when running on a CPU, and 354 MB/s and 452 MB/s respectively when running on an FPGA. Our results also show up to a 32% power efficiency improvement for the FPGA implementation over the CPU implementation.

We would like to thank Davor Capalija from Altera for his invaluable advice during our work on the FPGA version of the algorithm.
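The abstract does not give the scoring algorithm in detail, so the following is only a minimal Python sketch of what profile-based scoring with an optional Bloom filter might look like. It is not the authors' OpenCL kernel. The profile format (term to weight), the crude tokenizer, the filter size, and the salted SHA-256 hashing are all assumptions made for illustration. The idea shown is that the Bloom filter cheaply rejects most terms that are absent from the profile, and an exact lookup then removes any false positives.

```python
import hashlib
import re


class BloomFilter:
    """Fixed-size Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are arbitrary illustrative choices, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, term):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, term):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(term)
        )


def tokenize(html):
    """Strip tags crudely and split into lowercase words (illustrative only)."""
    text = re.sub(r"<[^>]*>", " ", html)
    return re.findall(r"[a-z0-9]+", text.lower())


def score_document(html, profile, bloom=None):
    """Sum the profile weights of the terms found in one document."""
    score = 0.0
    for term in tokenize(html):
        # Optional pre-filter: skip terms the Bloom filter says are absent.
        if bloom is not None and not bloom.might_contain(term):
            continue
        # The exact lookup discards any Bloom filter false positives.
        score += profile.get(term, 0.0)
    return score


# Hypothetical profile and documents, invented for this example.
profile = {"fpga": 2.0, "opencl": 1.5, "accelerator": 1.0}
bloom = BloomFilter()
for term in profile:
    bloom.add(term)

documents = [
    "<html><body><p>OpenCL kernels on an FPGA accelerator</p></body></html>",
    "<html><body><p>A recipe for tomato soup</p></body></html>",
]
# Documents are scored one at a time, mirroring the document-by-document stream.
scores = [score_document(doc, profile, bloom) for doc in documents]
print(scores)  # [4.5, 0.0]
```

Because the exact lookup always follows the filter, the scores are the same with and without the Bloom filter in this sketch; only the amount of lookup work changes. Whether filtering helps or hurts throughput depends on the platform, which is consistent with the abstract reporting separate figures for each configuration.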

