Runtime techniques for efficient Ray-Tracing on heterogeneous systems
Chih-Chen Kao, W. Hsu
2015 IEEE International Conference on Digital Signal Processing (DSP), published 2015-07-21
DOI: 10.1109/ICDSP.2015.7251838 (https://doi.org/10.1109/ICDSP.2015.7251838)
Citations: 5
Abstract
The prevalence of real-time multimedia delivery appliances has led to the development of a wide variety of efficient architectures and supporting software techniques. In particular, Ray-Tracing, a well-known physically based rendering algorithm, has received considerable attention in research and development as multi-core architectures have evolved, because massive parallelism is inherent in the application. Unfortunately, the computation in Ray-Tracing is an instance of an irregular application: its attributes may vary during execution and are often unpredictable, which makes it difficult to run efficiently on SIMD/SIMT-based GPGPU architectures. For example, the irregularity in such applications may cause control flow divergence, load imbalance, and low efficiency in the memory hierarchy of heterogeneous computing systems. To address these issues, researchers have tried different approaches, such as MIMD-based homogeneous platforms or specialized hardware solutions. While these approaches tend to emphasize dedicated special-purpose hardware configurations, our work illustrates that, with appropriate analysis and tuning for the irregularity within Ray-Tracing, it is possible to achieve high performance and high efficiency on current heterogeneous systems by applying a software-based runtime approach. We study and propose phase-guided dynamic work partitioning, a lightweight and fast analysis technique that collects information during program phases at runtime in order to guide work partitioning in subsequent phases, enabling more efficient work dispatching on heterogeneous systems. Experiments show that this approach can achieve a speedup of up to 5 times over the original system.
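The abstract does not give the details of phase-guided dynamic work partitioning, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each phase records how many work items (for example, rays) the CPU and the GPU processed and how long each took, and that the next phase splits work in proportion to the measured throughput. The names `PhaseStats`, `next_gpu_fraction`, `partition`, and the `smoothing` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PhaseStats:
    """Measurements collected during one program phase (hypothetical layout)."""
    cpu_items: int       # work items the CPU finished in this phase
    cpu_seconds: float   # wall-clock time the CPU spent on them
    gpu_items: int       # work items the GPU finished in this phase
    gpu_seconds: float   # wall-clock time the GPU spent on them


def next_gpu_fraction(stats: PhaseStats, previous_fraction: float,
                      smoothing: float = 0.5) -> float:
    """Estimate the share of work to give the GPU in the next phase.

    Assumption: throughput observed in the last phase predicts the next one.
    `smoothing` blends the new estimate with the previous split so that a
    single unusual phase does not swing the partition too far.
    """
    # Without a usable measurement for both devices, keep the previous split.
    if (stats.cpu_items <= 0 or stats.gpu_items <= 0
            or stats.cpu_seconds <= 0 or stats.gpu_seconds <= 0):
        return previous_fraction

    cpu_rate = stats.cpu_items / stats.cpu_seconds
    gpu_rate = stats.gpu_items / stats.gpu_seconds

    # The split that would let both devices finish at the same time.
    target = gpu_rate / (cpu_rate + gpu_rate)
    return (1.0 - smoothing) * previous_fraction + smoothing * target


def partition(num_items: int, gpu_fraction: float) -> tuple[int, int]:
    """Split `num_items` into (cpu_items, gpu_items) for the next phase."""
    gpu_items = round(num_items * gpu_fraction)
    gpu_items = max(0, min(num_items, gpu_items))  # clamp to a valid range
    return num_items - gpu_items, gpu_items


if __name__ == "__main__":
    # Example: in the last phase the GPU was four times faster per item.
    stats = PhaseStats(cpu_items=100, cpu_seconds=1.0,
                       gpu_items=100, gpu_seconds=0.25)
    fraction = next_gpu_fraction(stats, previous_fraction=0.5)
    print(partition(1000, fraction))
```

Proportional splitting with smoothing is one simple choice among many. A runtime aimed at Ray-Tracing would likely also use phase information that reflects irregularity, such as divergence or ray coherence, which this sketch does not model.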