{"title":"Combined functional partitioning and communication speed selection for networked voltage-scalable processors","authors":"N. Bagherzadeh, P. Chou, Jinfeng Liu","doi":"10.1145/581199.581205","DOIUrl":"https://doi.org/10.1145/581199.581205","url":null,"abstract":"This paper presents a new technique for global energy optimization through coordinated functional partitioning and speed selection for embedded processors interconnected by a high-speed serial bus. Many such serial interfaces are capable of operating at multiple speeds and can open up a new dimension of trade-offs to complement today's CPU-centric voltage scaling techniques for processors. We propose a multi-dimensional dynamic programming formulation for energy-optimal functional partitioning with CPU/communication speed selection for a class of data-regular applications under performance constraints. We demonstrate the effectiveness of our optimization techniques with an image processing application mapped onto a multi-processor architecture with a multi-speed Ethernet.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116413356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Formal verification in a component-based reuse methodology","authors":"P. Eles, Zebo Peng, D. Karlsson","doi":"10.1145/581199.581235","DOIUrl":"https://doi.org/10.1145/581199.581235","url":null,"abstract":"There is an important trend towards design processes based on the reuse of predesigned components. We propose a formal verification approach which smoothly integrates with a component based system-level design methodology. Once a timed Petri net model corresponding to the interface logic has been produced the correctness of the system can be formally verified. The verification is based on the interface properties of the connected components and on abstract models of their functionality, without assuming any knowledge regarding their implementation. We have both developed the theoretical framework underlying the methodology and implemented an experimental environment using model checking techniques.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115304410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-power data memory communication for application-specific embedded processors","authors":"A. Orailoglu, Peter Petrov","doi":"10.1145/581199.581248","DOIUrl":"https://doi.org/10.1145/581199.581248","url":null,"abstract":"We propose a novel customization methodology for power reduction on the communication link between an embedded processor and its data memory. We target the address bus and show how, by utilizing application information about the memory references in data-intensive program loops, a power-efficient address communication protocol can be established between the processor core and the data memory. The data memory controller thus generates the addresses for the various data streams with minimal run-time information from the processor engine, achieving significant power reductions on the address bus. Efficient reprogrammable hardware support is presented for enabling the proposed methodology. The experimental results demonstrate the efficacy of the approach for a set of data-intensive applications.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114239962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Round-robin Arbiter Design and Generation","authors":"V. Mooney, G. Riley, Eung S. Shin","doi":"10.1145/581199.581253","DOIUrl":"https://doi.org/10.1145/581199.581253","url":null,"abstract":"In this paper, we introduce a Round-robin Arbiter Generator (RAG) tool. The RAG tool can generate a design for a Bus Arbiter (BA). The BA is able to handle the exact number of bus masters for both on-chip and off-chip buses. RAG can also generate a distributed and parallel hierarchical Switch Arbiter (SA). The first contribution of this paper is the automated generation of a round-robin token passing BA to reduce time spent on arbiter design. The generated arbiter is fair, fast, and has a low and predictable worst-case wait time. The second contribution of this paper is the design and integration of a distributed fast arbiter, e.g., for a terabit switch, based on 2×2 and 4×4 switch arbiters (SAs). Using a 0.25 µm TSMC standard cell library from LEDA Systems [10, 14], we show the arbitration time of a 256×256 SA for a terabit switch and demonstrate that the SA generated by RAG meets the time constraint to achieve approximately six terabits of throughput in a typical network switch design. Furthermore, our generated SA performs better than the Ping-Pong Arbiter and Programmable Priority Encoder by a factor of 1.9× and 2.4×, respectively.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128058586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving embedded system design by means of HW-SW compilation on reconfigurable coprocessors","authors":"J. C. López, Fernando Rincón Calle, F. Moya, J.M. Moya","doi":"10.1145/581199.581255","DOIUrl":"https://doi.org/10.1145/581199.581255","url":null,"abstract":"This article describes a new approach to HW-SW codesign for complex embedded systems, using high-level programming languages. Unlike in previous approaches, the designer does not need to acquire new skills, because most of the design process is automated. The hardware extensions are implemented as simple coprocessors consisting of a reconfigurable datapath and a control memory. Our approach is demonstrated with a simple image processing application, obtaining a 100% performance improvement.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"322 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123403146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validation in a component-based design flow for multicore SoCs","authors":"A. Jerraya, S. Yoo, A. Bouchhima, G. Nicolescu","doi":"10.1145/581199.581236","DOIUrl":"https://doi.org/10.1145/581199.581236","url":null,"abstract":"Since many SoCs now include heterogeneous components such as CPUs, DSPs, ASICs, memories, and buses, system integration has become a major step in the design flow. To enable this integration, we use a component-based design approach. In this approach, the validation of system integration takes most of the design effort. This paper presents an automatic method for SoC design validation. Based on a generic simulation wrapper architecture, the presented method provides automatic generation of executable models throughout the different stages of the SoC design flow. A case study of validating a VDSL application shows the effectiveness of the method.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133830616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMP on SoC: architect's view","authors":"S. Sakai","doi":"10.1145/581199.581222","DOIUrl":"https://doi.org/10.1145/581199.581222","url":null,"abstract":"This paper briefly sketches the current issues of chip multiprocessor (CMP) design for system LSI chip from the viewpoint of computer architects.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132173682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tuning of loop cache architectures to programs in embedded system design","authors":"F. Vahid, S. Cotterell","doi":"10.1145/581199.581204","DOIUrl":"https://doi.org/10.1145/581199.581204","url":null,"abstract":"Adding a small loop cache to a microprocessor has been shown to reduce average instruction fetch energy for various sets of embedded system applications. With the advent of core-based design, embedded system designers can now tune a loop cache architecture to best match a specific application. We developed an automated simulation environment to find the best loop cache architecture for a given application and technology. Using this environment, we show significant variation in the best architecture for different examples. The results support the need for future fast synthesis of tuned loop cache architectures.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133364225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy/power estimation of regular processor arrays","authors":"S. Rajopadhye, Steven Derrien","doi":"10.1145/581199.581212","DOIUrl":"https://doi.org/10.1145/581199.581212","url":null,"abstract":"We propose a high-level analytical model for estimating the energy and/or power dissipation in VLSI processor (systolic) array implementations of loop programs, particularly for implementations on FPGA-based co-processors. We focus on the respective impact of the array design parameters on the overall off-chip I/O traffic and the number and sizes of the local memories in the array. The model is validated experimentally and shows good results (12.7% RMS error in the predictions).","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123789901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unifying memory and processor wrapper architecture in multiprocessor SoC design","authors":"A. Jerraya, D. Lyonnard, S. Meftali, F. Rousseau, F. Gharsalli","doi":"10.1145/581199.581207","DOIUrl":"https://doi.org/10.1145/581199.581207","url":null,"abstract":"In this paper, we present a new methodology for application-specific multiprocessor system-on-chip design. This approach facilitates the integration of existing components through the concept of wrappers. Wrappers allow automatic adaptation of physical interfaces to a communication network. We also give a generic architecture to produce these wrappers, either for processors or for other specific components such as memory IP. This approach has been successfully applied to a low-level image processing application.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122308370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}