A chip set for a ray-casting engine
G. Hekstra, E. Deprettere
VLSI Signal Processing, IX, 1996-10-30
DOI: 10.1109/VLSISP.1996.558335 (https://doi.org/10.1109/VLSISP.1996.558335)
Citations: 2
Abstract
Rendering artificial scenes is an appealing example of a class of problems that lead to complex, data-dependent algorithms, for which efficient software/hardware mapping techniques have to be devised. We present one of the ASICs in our rendering system to illustrate our design methodology in more detail. The first step in the algorithm-architecture design is to reformulate an existing naive algorithm so that, as far as possible, only significant operations are performed. The resulting algorithm has a nested loop structure with non-manifest, data-dependent loop bounds, which renders classical parallelisation techniques useless. The second step is to greatly reduce the overall computation time by reducing the computational complexity of the innermost loop operation. The third and last step is to map this algorithm onto a pipelined architecture in which the pipeline stages (functional units within an ASIC) implement different loop levels. Because of this data dependence, the functional units that implement the parts of the loops vary over time both in their execution time and in the amount of data they produce for the following pipeline stages. Since the execution times of the pipeline stages change, so does the location of the bottleneck. Hence the goal is not to keep all pipeline stages continually busy, but to keep the throughput of the most critical innermost loop operation as high as possible.
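The loop structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the one-dimensional grid of cells, the `(start, length)` ray encoding, and all function names are hypothetical stand-ins chosen only to show how data-dependent loop bounds arise once a naive "test every ray against every object" loop is reformulated to perform only the significant tests.

```python
from typing import List, Tuple

Ray = Tuple[int, int]          # hypothetical encoding: (first cell, number of cells)
Grid = List[List[int]]         # hypothetical scene: object ids stored per cell


def cells_on_ray(ray: Ray, n_cells: int) -> List[int]:
    """Cells a toy ray passes through; the count depends on the ray itself."""
    start, length = ray
    return list(range(start, min(start + length, n_cells)))


def count_naive_tests(rays: List[Ray], grid: Grid) -> int:
    """Naive formulation: fixed loop bounds, every ray against every object."""
    n_objects = sum(len(objects) for objects in grid)
    return len(rays) * n_objects


def count_reformulated_tests(rays: List[Ray], grid: Grid) -> int:
    """Reformulated nest: both inner bounds are known only at run time."""
    tests = 0
    for ray in rays:                              # outer loop level
        for cell in cells_on_ray(ray, len(grid)):  # bound depends on the ray
            for _obj in grid[cell]:                # bound depends on the scene
                tests += 1                         # stand-in for one intersection test
    return tests


if __name__ == "__main__":
    grid = [[0, 1], [], [2], [3, 4, 5]]
    rays = [(0, 2), (2, 2), (1, 1)]
    print(count_naive_tests(rays, grid), count_reformulated_tests(rays, grid))
```

In this toy setting the work per outer iteration varies from zero to several innermost operations, which is the kind of time-varying behaviour that, according to the abstract, moves the bottleneck between pipeline stages.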