A 186Mvertices/s 161mW Floating-Point Vertex Processor for Mobile Graphics Systems
Chang-Hyo Yu, Kyusik Chung, Donghyun Kim, L. Kim
2007 IEEE Custom Integrated Circuits Conference, 2007-09-16
DOI: 10.1109/CICC.2007.4405798 (https://doi.org/10.1109/CICC.2007.4405798)
Citations: 7
Abstract
In this paper, a power-efficient vertex processor with a geometry-specific arithmetic unit, vertex caches, and a vertex texturing unit is presented for mobile graphics environments. The arithmetic unit takes advantage of the characteristics of geometry operations: it uses a four-threaded, four-issue expanded VLIW datapath with a quad-float vertex texture fetcher. The vertex caches are optimized for higher power efficiency. Moreover, an instruction-level power control method is adopted, combining operand sharing and writeback re-allocation with operand isolation and gated clocks. The proposed vertex processor achieves a geometry performance of 186 Mvertices/s, which is 1.6 times faster than previous results that adopt IEEE 754-compliant arithmetic units, and it supports OpenGL ES 2.0 and Vertex Shader Model 3.0. The processor is implemented in a 0.18-μm 1P4M CMOS process.
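The headline figures can be put in perspective with some simple arithmetic. The sketch below is illustrative only: it assumes that the 161 mW in the title is the power drawn at the 186 Mvertices/s operating point, which the abstract does not state explicitly, and the function names are hypothetical.

```python
# Figures quoted in the title and abstract.
THROUGHPUT_MVPS = 186.0  # geometry performance, Mvertices/s
POWER_MW = 161.0         # from the title; ASSUMED to be the power at peak throughput
SPEEDUP = 1.6            # "1.6 times faster" than previous results


def energy_per_vertex_nj(throughput_mvps: float, power_mw: float) -> float:
    """Energy per vertex in nanojoules: power (W) divided by vertices per second."""
    watts = power_mw * 1e-3
    vertices_per_s = throughput_mvps * 1e6
    return watts / vertices_per_s * 1e9


def efficiency_mvps_per_mw(throughput_mvps: float, power_mw: float) -> float:
    """Throughput per unit power, in Mvertices/s per mW."""
    return throughput_mvps / power_mw


def implied_baseline_mvps(throughput_mvps: float, speedup: float) -> float:
    """Throughput of the previous result implied by the quoted speedup."""
    return throughput_mvps / speedup


if __name__ == "__main__":
    print(f"energy per vertex: {energy_per_vertex_nj(THROUGHPUT_MVPS, POWER_MW):.3f} nJ")
    print(f"efficiency: {efficiency_mvps_per_mw(THROUGHPUT_MVPS, POWER_MW):.3f} Mvertices/s per mW")
    print(f"implied baseline: {implied_baseline_mvps(THROUGHPUT_MVPS, SPEEDUP):.2f} Mvertices/s")
```

Under that assumption, the processor spends roughly 0.87 nJ per vertex (about 1.16 Mvertices/s per mW), and the quoted 1.6x speedup implies a prior result of about 116 Mvertices/s.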