Ichiro Kuroda, Eri Murata, Kouhei Nadehara, Kazumasa Suzuki, T. Arai, Atsushi Okamura
Published in: 1998 IEEE Workshop on Signal Processing Systems. SIPS 98. Design and Implementation (Cat. No.98TH8374), 1998-10-08
DOI: https://doi.org/10.1109/SIPS.1998.715773
A 16-bit parallel MAC architecture for a multimedia RISC processor
This paper presents a parallel MAC (multiply-accumulate) architecture designed for DSP applications on a 200-MHz, 1.6-GOPS multimedia RISC processor. The processor's datapath is designed to execute a data transfer in parallel with SIMD arithmetic operations. SIMD parallel 16-bit MAC instructions are introduced together with a symmetric rounding scheme that maximizes the accuracy of the 18-bit accumulation. The parallel 16-bit MAC instruction, operating on a 64-bit datapath, is shown to be used efficiently in DSP applications such as convolution. With the parallel MAC instruction and the symmetric rounding scheme, a two-dimensional inverse discrete cosine transform (2D-IDCT) that satisfies IEEE 1180 can be implemented in 202 cycles.
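The abstract does not give the instruction semantics, so the following is only a rough software sketch of the general idea: four signed 16-bit lanes packed into one 64-bit word, each lane multiplied, rounded, and accumulated independently. Everything specific here is an assumption made for illustration: the function names, the Q15 scaling (`shift=15`), the saturating add, the lane order, and the reading of "symmetric rounding" as round-half-away-from-zero. None of it is taken from the paper.

```python
MASK16 = 0xFFFF

def unpack4(word):
    """Split a 64-bit word into four signed 16-bit lanes (lane 0 = low bits)."""
    lanes = []
    for i in range(4):
        v = (word >> (16 * i)) & MASK16
        lanes.append(v - 0x10000 if v & 0x8000 else v)
    return lanes

def pack4(lanes):
    """Pack four signed 16-bit values into one 64-bit word."""
    word = 0
    for i, v in enumerate(lanes):
        word |= (v & MASK16) << (16 * i)
    return word

def round_shift_symmetric(x, shift):
    """Right shift that rounds halves away from zero (assumed meaning of
    'symmetric'): x and -x always produce results of equal magnitude."""
    if shift == 0:
        return x
    half = 1 << (shift - 1)
    mag = (abs(x) + half) >> shift
    return mag if x >= 0 else -mag

def round_shift_floor(x, shift):
    """Plain arithmetic shift (rounds toward minus infinity), for comparison."""
    return x >> shift

def saturate16(x):
    """Clamp to the signed 16-bit range (an assumed overflow policy)."""
    return max(-32768, min(32767, x))

def simd_mac16(acc, a, b, shift=15):
    """Per lane: acc + round(a * b >> shift), with hypothetical Q15 scaling."""
    out = []
    for acc_i, a_i, b_i in zip(unpack4(acc), unpack4(a), unpack4(b)):
        prod = round_shift_symmetric(a_i * b_i, shift)
        out.append(saturate16(acc_i + prod))
    return pack4(out)

if __name__ == "__main__":
    a = pack4([16384, -16384, 3, -3])
    b = pack4([16384, 16384, 16384, 16384])
    print(unpack4(simd_mac16(0, a, b)))  # prints [8192, -8192, 2, -2]
```

In this sketch the plain shift maps 3 and -3 (shifted by one bit) to 1 and -2, while the symmetric version maps them to 2 and -2. A sign-dependent bias of the first kind can build up over the many accumulations in a convolution or IDCT, which is presumably what a symmetric scheme is meant to avoid. As a side note, 1.6 GOPS at 200 MHz works out to 8 operations per cycle, which would be consistent with four lanes each performing a multiply and an add, although the abstract does not state this breakdown.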