Title: Energy and Performance Characterization of Mobile Heterogeneous Computing
Authors: Yi-Chu Wang, K. Cheng
Venue: 2012 IEEE Workshop on Signal Processing Systems
Publication date: 2012-10-17
DOI: 10.1109/SiPS.2012.61
URL: https://doi.org/10.1109/SiPS.2012.61
Citation count: 10
Abstract
A modern mobile application processor is a heterogeneous multi-core SoC that integrates a CPU with application-specific accelerators such as a GPU and a DSP. These accelerators offer an opportunity to speed up other compute-intensive applications, yet mapping an algorithm onto such a heterogeneous platform is not straightforward and involves many design decisions. In this paper, we evaluate the performance and energy benefits of using the integrated GPU and DSP cores to offload or share the CPU's compute-intensive tasks. The evaluation is conducted on three representative mobile platforms, TI's OMAP3530, Qualcomm's Snapdragon S2, and Nvidia's Tegra2, using computation tasks common in mobile applications. We identify key factors that should be considered in energy-optimized mobile heterogeneous computing. Our results show that, by effectively using all the computing cores concurrently, an average 3.7X performance improvement can be achieved at the cost of 33% more power consumption, compared with using the CPU only. This corresponds to a 2.8X energy saving.
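The relationship between the three headline figures can be checked with a short calculation. The sketch below assumes that energy is simply average power multiplied by execution time, and that the averaged speedup and power figures can be combined directly; the abstract does not state how the 2.8X figure was derived, so this is only a consistency check. The function name `energy_saving_factor` is hypothetical and not taken from the paper.

```python
def energy_saving_factor(speedup: float, power_increase: float) -> float:
    """Return (baseline energy) / (new energy).

    Assumption (not from the paper): energy = average power x execution time.
    speedup        -- performance improvement over the CPU-only case, e.g. 3.7
    power_increase -- fractional extra power draw, e.g. 0.33 for 33% more
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    relative_power = 1.0 + power_increase   # power relative to CPU-only
    relative_time = 1.0 / speedup           # execution time relative to CPU-only
    relative_energy = relative_power * relative_time
    return 1.0 / relative_energy


if __name__ == "__main__":
    factor = energy_saving_factor(speedup=3.7, power_increase=0.33)
    print(f"energy saving: {factor:.1f}X")  # prints "energy saving: 2.8X"
```

Under this assumption, running 3.7 times faster at 1.33 times the power uses about 36% of the CPU-only energy, which matches the reported 2.8X saving.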