Jinyong Lee, Seungjun Yang, Sanghyun Park, Ingoo Heo, Y. Paek
2010 International SoC Design Conference, published 2010-11-01. DOI: 10.1109/SOCDC.2010.5682944 (https://doi.org/10.1109/SOCDC.2010.5682944)
VLIW processor for H.264: Integer transform and Quantization
As the performance of mobile devices increases, so does the demand for watching high-quality video on them. VLIW (Very Long Instruction Word) processors have been used as coprocessors to accelerate various codecs in embedded systems, e.g. TI DaVinci, but a general-purpose VLIW carries too much redundancy when the set of applications it must execute is restricted. In this paper, we propose a VLIW processor focused on the DCT and quantization of H.264. The proposed architecture has 4 issue slots and a 16-bit data path, half of what TI's TMS320C6x series provides, yet it outperforms the TMS320C6x series in cycle count and throughput.
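The abstract does not describe the kernels themselves, so as background, here is a minimal Python sketch of the two operations it names: the 4x4 forward integer transform (the integer approximation of the DCT used by H.264) and scalar quantization. It follows the formulation commonly given in H.264 tutorials and is not the authors' implementation. The function names are hypothetical, and the multiplier table `MF` holds the values usually tabulated for this scheme, which should be checked against the standard before any real use.

```python
# Illustrative sketch only: names are hypothetical, not taken from the paper.

def butterfly_1d(a, b, c, d):
    """4-point core transform using only additions and shifts.

    Equivalent to multiplying by the matrix
    [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]].
    """
    s0, s1 = a + d, b + c
    d0, d1 = a - d, b - c
    return (s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1))


def forward_transform(x):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [butterfly_1d(*r) for r in x]
    cols = [butterfly_1d(*c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


# Assumed multiplier table, indexed by QP % 6. Each entry is
# (both indices even, both indices odd, all other positions).
MF = [
    (13107, 5243, 8066),
    (11916, 4660, 7490),
    (10082, 4194, 6554),
    (9362, 3647, 5825),
    (8192, 3355, 5243),
    (7282, 2893, 4559),
]


def multiplier(qp, i, j):
    """Pick the multiplier for coefficient position (i, j)."""
    even_even, odd_odd, mixed = MF[qp % 6]
    if i % 2 == 0 and j % 2 == 0:
        return even_even
    if i % 2 == 1 and j % 2 == 1:
        return odd_odd
    return mixed


def quantize(w, qp, intra=True):
    """Quantize transformed coefficients with a multiply, add, and shift."""
    qbits = 15 + qp // 6
    f = (1 << qbits) // (3 if intra else 6)  # rounding offset
    out = []
    for i, row in enumerate(w):
        q_row = []
        for j, v in enumerate(row):
            level = (abs(v) * multiplier(qp, i, j) + f) >> qbits
            q_row.append(level if v >= 0 else -level)
        out.append(q_row)
    return out
```

For example, a flat block whose 16 samples all equal 1 transforms to a single DC coefficient of 16 with every other coefficient zero. Both stages reduce to additions, shifts, and one multiply per coefficient, which is the kind of regular integer workload a narrow, specialized data path can target.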