T. Machino, Shintaro Iwazaki, Y. Okuyama, J. Kitamichi, Kenichi Kuroda, R. Oka
2008 Japan-China Joint Workshop on Frontier of Computer Science and Technology. Published 2008-12-27. DOI: 10.1109/FCST.2008.35 (https://doi.org/10.1109/FCST.2008.35)
Optimizing Two-Dimensional Continuous Dynamic Programming for Cell Broadband Engine Processors
Two-dimensional continuous dynamic programming (2DCDP), a specialized DP matching method for image recognition, can be applied to tasks such as object tracking and pattern matching. However, its execution time is long, and current general-purpose processors cannot run it in real time. In this paper, we present our approach to real-time image recognition using a Cell Broadband Engine processor (Cell processor). We optimize 2DCDP for the Cell processor by vectorizing with SIMD instructions, parallelizing across multiple SPEs, applying dynamic branch prediction at the assembly level, and other techniques. As a result, 2DCDP runs more than 15 times faster on the Cell processor than on an Intel Xeon 5160 processor.
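
The abstract gives no details of the 2DCDP recurrence, so the following is only a rough illustration of the general pattern it describes: a DP matching kernel whose independent instances are handed to several workers. It uses a textbook one-dimensional DP matching recurrence, not 2DCDP, and the names `dp_match_cost` and `match_all` are hypothetical. A Python thread pool stands in for the SPEs purely to show the work split; it is not expected to reproduce the speedup reported above.

```python
from concurrent.futures import ThreadPoolExecutor


def dp_match_cost(reference, signal):
    """Accumulated cost of the best monotonic alignment of two
    non-empty numeric sequences (textbook DP matching, not 2DCDP)."""
    inf = float("inf")
    cols = len(signal)
    prev = [inf] * cols  # accumulated costs for the previous reference row
    for i, ref_value in enumerate(reference):
        cur = [inf] * cols
        for j in range(cols):
            local = abs(ref_value - signal[j])  # local distance
            if i == 0 and j == 0:
                best = 0
            else:
                # Best predecessor: up, left, or diagonal cell.
                best = min(
                    prev[j] if i > 0 else inf,
                    cur[j - 1] if j > 0 else inf,
                    prev[j - 1] if i > 0 and j > 0 else inf,
                )
            cur[j] = local + best
        prev = cur
    return prev[-1]


def match_all(references, signal, workers=4):
    """Match each reference against the signal, one task per reference.
    The tasks share no state, so they can run on separate workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda ref: dp_match_cost(ref, signal), references))
```

Because each call to `dp_match_cost` touches only its own row buffers, the tasks need no synchronization. That independence is the property that makes this kind of workload a candidate for distribution across cores, and the inner loop over `j` is the part a SIMD implementation would aim to process several cells at a time.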