F. D. Smedt, Lars Struyf, Sander Beckers, Joost Vennekens, G. D. Samblanx, T. Goedemé
International Conference on Pervasive and Embedded Computing and Communication Systems, 2012-02-26. DOI: 10.5220/0003821002840291
Is the Game worth the Candle? - Evaluation of OpenCL for Object Detection Algorithm Optimization
In this paper we present our experiences with implementing an object detector in OpenCL. This implementation addresses the need for fast and robust object detection, which is required in many applications across multiple domains (surveillance, traffic, image retrieval, ...). The algorithm lends itself to parallel implementation, and we exploit this by running it on a GPU. We chose OpenCL because it allows the implementation to scale to faster and different types of hardware with minimal changes. We discuss how the parallelization is done and the challenges we encountered, present the experimental timing results we achieved, and evaluate the ease of use of OpenCL.