Estimating Performance of a Ray-Tracing ASIC Design

2006 IEEE Symposium on Interactive Ray Tracing Pub Date : 2006-09-01 DOI:10.1109/RT.2006.280209

Sven Woop, E. Brunvand, P. Slusallek


Citations: 37

Abstract

Recursive ray tracing is a powerful rendering technique used to compute realistic images by simulating the global light transport in a scene. Algorithmic improvements and FPGA-based hardware implementations of ray tracing have demonstrated real-time performance, but hardware that achieves performance levels comparable to commodity rasterization graphics chips is still not available. This paper describes the architecture and ASIC implementations of the DRPU (Dynamic Ray Processing Unit) design, which closes this performance gap. The DRPU supports fully programmable shading and most kinds of dynamic scenes, and thus provides capabilities similar to those of current GPUs. It achieves high efficiency through SIMD processing of floating-point vectors, massive multithreading, synchronous execution of packets of threads, and careful management of caches for scene data. To support dynamic scenes, B-KD trees are used as spatial index structures; they are processed by a custom traversal and intersection unit and modified by an update processor when the scene changes. The DRPU architecture is specified as a high-level structural description in a functional language and mapped to both FPGA and ASIC implementations. Our FPGA prototype, clocked at 66 MHz, achieves higher ray tracing performance than CPU-based ray tracers, even on a modern multi-GHz CPU. We provide performance results for two 130 nm ASIC versions and estimate what the performance would be using a 90 nm CMOS process. For a 90 nm version with a 196 mm² die, we conservatively estimate clock rates of 400 MHz and ray tracing performance of 80 to 290 fps at 1024×768 resolution in our test scenes. This estimated performance is 70 times faster than what is achievable with standard multi-GHz desktop CPUs.
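To put the 90 nm estimate in perspective, the figures quoted in the abstract (400 MHz, 80 to 290 fps, 1024×768) can be turned into a rough throughput calculation. The sketch below is not from the paper. It assumes exactly one primary ray per pixel per frame and ignores shadow, reflection, and other secondary rays, so the ray counts are a lower bound and the cycles-per-ray figures are an upper bound. It also assumes, purely for illustration, that the 70× speedup applies uniformly across the frame-rate range. All function names are hypothetical.

```python
# Back-of-the-envelope throughput implied by the abstract's 90 nm estimate.
# Assumption (not from the paper): one primary ray per pixel per frame,
# with no secondary rays counted.

CLOCK_HZ = 400_000_000      # conservatively estimated 90 nm clock rate
WIDTH, HEIGHT = 1024, 768   # resolution quoted in the abstract
FPS_LOW, FPS_HIGH = 80, 290 # estimated frame-rate range in the test scenes
SPEEDUP_OVER_CPU = 70       # quoted speedup over multi-GHz desktop CPUs


def primary_rays_per_second(fps, width=WIDTH, height=HEIGHT):
    """Primary rays traced per second at the given frame rate."""
    return fps * width * height


def cycles_per_primary_ray(fps, clock_hz=CLOCK_HZ):
    """Clock cycles available per primary ray across the whole chip."""
    return clock_hz / primary_rays_per_second(fps)


def implied_cpu_fps(fps, speedup=SPEEDUP_OVER_CPU):
    """CPU frame rate implied if the speedup applied uniformly (assumption)."""
    return fps / speedup


if __name__ == "__main__":
    for fps in (FPS_LOW, FPS_HIGH):
        print(
            f"{fps} fps: {primary_rays_per_second(fps):,} primary rays/s, "
            f"{cycles_per_primary_ray(fps):.2f} cycles per primary ray, "
            f"implied CPU rate {implied_cpu_fps(fps):.2f} fps"
        )
```

Under these assumptions the chip would need to retire a primary ray roughly every 2 to 6 clock cycles, which is only plausible with the parallelism the abstract describes (SIMD processing, massive multithreading, and packets of threads).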
