Authors: M. Breternitz, John Paul Shen
Venue: MICRO 24
Publication date: 1991-09-01
DOI: 10.1145/123465.123488 (https://doi.org/10.1145/123465.123488)
Citations: 3
Abstract
Implementation optimization techniques for architecture synthesis of application-specific processors
An architecture synthesis method for the automated design of high-performance application-specific processors has been proposed. This method divides the design task into the Specification Optimization (behavioral) and Implementation Optimization (structural) phases. In an earlier paper [~], powerful algorithms for performing specification optimization are presented. High performance is achieved via exploitation of fine-grain parallelism. The architecture design style uses a template resembling a scalable Very Long Instruction Word (VLIW) processor. This paper presents new algorithms for performing implementation optimization, which map the optimized specification in the form of highly parallelized code to efficient hardware implementations. A scalable implementation template is used to constrain the implementation style. Graph coloring algorithms are employed to produce the optimized implementations. The entire architecture synthesis procedure has been implemented and applied to numerous examples. Results on these examples are presented. Speedups in the range of 2.6 to 7.7 over contemporary RISC processors have been obtained. The computation times needed for the synthesis of these examples are on the order of a few seconds.
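The abstract does not say which graph coloring algorithms are used or which resources they color, so the following is only a generic illustration of the idea, not the authors' method. It assumes a hypothetical setting in which values produced by parallelized code carry lifetimes measured in cycles, two values conflict when their lifetimes overlap, and each color stands for one storage unit. All names (`build_conflict_graph`, `greedy_coloring`, the `lifetimes` data) are invented for this sketch.

```python
from typing import Dict, Set, Tuple


def build_conflict_graph(lifetimes: Dict[str, Tuple[int, int]]) -> Dict[str, Set[str]]:
    """Connect two values when their half-open [start, end) lifetimes overlap."""
    graph: Dict[str, Set[str]] = {name: set() for name in lifetimes}
    names = list(lifetimes)
    for i, a in enumerate(names):
        a_start, a_end = lifetimes[a]
        for b in names[i + 1:]:
            b_start, b_end = lifetimes[b]
            if a_start < b_end and b_start < a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def greedy_coloring(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Give each node the smallest color not used by an already-colored neighbor.

    Nodes are visited in order of decreasing degree (ties broken by name).
    This is a common heuristic, assumed here for illustration only; it does
    not guarantee a minimum number of colors on arbitrary graphs.
    """
    order = sorted(graph, key=lambda n: (-len(graph[n]), n))
    colors: Dict[str, int] = {}
    for node in order:
        taken = {colors[nb] for nb in graph[node] if nb in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


# Hypothetical value lifetimes, in cycles, for a short parallelized code fragment.
lifetimes = {"a": (0, 3), "b": (1, 4), "c": (3, 6), "d": (4, 7), "e": (6, 8)}
graph = build_conflict_graph(lifetimes)
assignment = greedy_coloring(graph)
units_needed = max(assignment.values()) + 1
```

In this toy input the conflict graph is a simple chain, so two storage units are enough for five values. In a real synthesis flow the conflict relation and the colored resource (registers, functional units, buses) would come from the implementation template, which the abstract does not detail.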