T. Nakata, Akio Matsuda, M. Shoji, S. Kuwamura, Qiang Zhu
{"title":"An object-oriented design process for system-on-chip using UML","authors":"T. Nakata, Akio Matsuda, M. Shoji, S. Kuwamura, Qiang Zhu","doi":"10.1145/581199.581254","DOIUrl":"https://doi.org/10.1145/581199.581254","url":null,"abstract":"The object-oriented design process has been a hot topic in software development since it will improve product quality and productivity significantly, which is also a major issue in system-on-chip design. In this paper, a design process is proposed for hardware-software heterogeneous systems by reinforcing parallelism, structure, and timing. The management of design abstraction is also introduced for refinement of hardware. UML is used as a modeling language, and the reinforcement above is gracefully integrated into UML by its extensibility mechanism. An example of architecture exploration and performance analysis is illustrated through the application of the process to an image decoding design.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124771181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Lavagno, M. Lazarescu, S. Quer, Sergio Nocco, C. Passerone, G. Cabodi
{"title":"A symbolic approach for the combined solution of scheduling and allocation","authors":"L. Lavagno, M. Lazarescu, S. Quer, Sergio Nocco, C. Passerone, G. Cabodi","doi":"10.1145/581199.581252","DOIUrl":"https://doi.org/10.1145/581199.581252","url":null,"abstract":"Scheduling is widely recognized as a very important step in high-level synthesis. Nevertheless, it is usually done without taking into account the effects on the actual hardware implementation. This paper presents an efficient symbolic technique to concurrently integrate operation scheduling and resource allocation. The technique inherits all the features of \"standard\" BDD-based control dominated scheduling, including resource-constraining, speculation and pruning. In addition, it introduces an efficient way of encoding allocation information within a symbolic scheduling automaton with a two-folded target. Firstly, it finds a minimum cost allocation of operation resources satisfying a given schedule. Secondly, it optimizes the amount of registers required to store intermediate results of operations. Theory and algorithms are developed and presented. Experimental results on a well known set of benchmarks show the potentiality of the approach.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115968419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A run-time word-level reconfigurable coarse-grain functional unit for a VLIW processor","authors":"Carles Rodoreda Sala, N. Busá","doi":"10.1145/581199.581211","DOIUrl":"https://doi.org/10.1145/581199.581211","url":null,"abstract":"Nowadays, new DSP applications are offering combined and flexible multimedia and telecom services. VLIW processor architectures, which include dedicated but inflexible functional units, are usually tuned to a single specific application. In order to accelerate a wide range of applications, we propose a VLIW processor containing a novel run-time reconfigurable functional unit (RC-FU). Only a few hundred bits and few cycles are necessary to configure a new coarse-grain operation on the RC-FU unit. After reconfiguring its internal datapath and microprogram, the RC-FU can execute a number of look-alike DSP functions, such as 8-point DCT or 4-point FFT. The RC-FU itself is a VLIW processor and the configuration contexts are generated using a high-level synthesis tool. The proposed RC-FU provides high processing power and can be efficiently tuned to the requirements of a variety of DSP applications.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122869952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Balakrishnan, Anshul Kumar, P. Ienne, Anup Gangwar, Bhuvan Middha
{"title":"A Trimaran based framework for exploring the design space of VLIW ASIPs with coarse grain functional units","authors":"M. Balakrishnan, Anshul Kumar, P. Ienne, Anup Gangwar, Bhuvan Middha","doi":"10.1145/581199.581203","DOIUrl":"https://doi.org/10.1145/581199.581203","url":null,"abstract":"It is widely accepted that use of an Application Specific Instruction Set Processor (ASIP) in an embedded system can provide a solution which is much more flexible than ASICs and much more efficient than standard processors in terms of performance and power consumption. However a lack of an acceptable design methodology and supporting tools for ASIPs limits their use even today. We present in this paper a methodology for design space exploration of high performance VLIW ASIPs by modeling Application Specific Functional Units in Trimaran Compiler Infrastructure. To demonstrate the effectiveness of our strategy we consider two important applications FFT and Kalman Filter and perform compute intensive operations in these applications via special Functional Units. The results we obtain are very promising with up to 2/spl times/ speed improvement.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114329233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new performance evaluation approach for system level design space exploration","authors":"C. P. Joshi, Anshul Kumar, M. Balakrishnan","doi":"10.1145/581199.581239","DOIUrl":"https://doi.org/10.1145/581199.581239","url":null,"abstract":"Application specific systems have potential for customization of design with a view to achieve a better cost-performance-power trade-off. Such customization requires extensive design space exploration. In this paper, we introduce a performance evaluation methodology for system-level design exploration that is much faster than traditional cycle-accurate simulation. The trade off is between accuracy and simulation speed. The methodology is based on probabilistic modeling of system components customized with application behavior. Performance numbers are generated by simulating these models. We have implemented our models using SystemC and validated these for uni-processor as well as multiprocessor systems against various benchmarks.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"8 9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130527506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A case study of hardware and software synthesis in ForSyDe","authors":"I. Sander, A. Jantsch, Zhonghai Lu","doi":"10.1145/581199.581219","DOIUrl":"https://doi.org/10.1145/581199.581219","url":null,"abstract":"ForSyDe (FORmal SYstem DEsign) is a methodology which addresses the design of SoC applications which may contain control as well as data flow dominated parts. Starting with a formal system specification, which captures the functionality of the system, it provides refinement methods inside the functional domain to transform the abstract specification into an efficient implementation model which serves as a starting point for synthesis into hardware and software. In this paper we illustrate with a case study of a digital equalizer how a ForSyDe model can be synthesized into a hardware, a software or a combined hardware/software implementation.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114203991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Chatterjee, P. Ellervee, V. Mooney, Jun-Cheol Park, Kyu-won Choi, Kiran Puttaswamy
{"title":"System level power-performance trade-offs in embedded systems using voltage and frequency scaling of off-chip buses and memory","authors":"A. Chatterjee, P. Ellervee, V. Mooney, Jun-Cheol Park, Kyu-won Choi, Kiran Puttaswamy","doi":"10.1145/581199.581249","DOIUrl":"https://doi.org/10.1145/581199.581249","url":null,"abstract":"In embedded systems, off-chip buses and memory (i.e., L2 memory as opposed to the L1 memory which is usually on-chip cache) consume significant power often more than the processor itself. In this paper for the case of an embedded system with one processor chip and one memory chip, we propose frequency and voltage scaling of the off-chip buses and the memory chip and use a known micro-architectural enhancement called a store buffer to reduce the resulting impact on execution time. Our benchmarks show a system (processor + off-chip bus + off-chip memory) power savings of 28% to 36%, an energy savings of 13% to 35%, all while increasing the execution time in the range of 1% to 29%. Previous work in power-aware computing has focused on frequency and voltage scaling of the processors or selective power-down of sub-sets of off-chip memory chips. This paper quantitatively explores voltage/frequency scaling of off-chip buses and memory as a means of trading off performance for power/energy at the system level in embedded systems.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121181790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OpenMP: parallel programming API for shared memory multiprocessors and on-chip multiprocessors","authors":"M. Sato","doi":"10.1145/581199.581224","DOIUrl":"https://doi.org/10.1145/581199.581224","url":null,"abstract":"The OpenMP application programming interface is an emerging standard for parallel programming on shared-memory multiprocessors. Recently, OpenMP is attracting widespread interest because of its easy-to-use portable parallel programming model. In this paper, we describe a brief introduction of OpenMP API and its parallel programming. We present our Omni OpenMP complier and performance of some applications on a shared memory multiprocessor. In the end, a role of OpenMP for modern on-chip multiprocessors is discussed.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115315313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Balakrishnan, P. Marwedel, L. Wehmeyer, Nils Grunwald, R. Banakar, S. Steinke
{"title":"Reducing energy consumption by dynamic copying of instructions onto onchip memory","authors":"M. Balakrishnan, P. Marwedel, L. Wehmeyer, Nils Grunwald, R. Banakar, S. Steinke","doi":"10.1145/581199.581247","DOIUrl":"https://doi.org/10.1145/581199.581247","url":null,"abstract":"The number of mobile embedded systems is increasing and all of them are limited in their uptime by their battery capacity. Several hardware changes have been introduced during the last years, but the steadily growing functionality still requires further energy reductions, e.g. through software optimizations. A significant amount of energy can be saved in the memory hierarchy where most of the energy is consumed. In this paper, a new software technique is presented which supports the use of an onchip scratchpad memory by dynamically copying program parts into it. The set of selected program parts are determined with an optimal algorithm using integer linear programming. Experimental results show a reduction of the energy consumption by nearly 30%, a performance increase by 25% against a common cache system and energy improvements against a static approach of up to 38%.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"379 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116575294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Ha, Sungchan Kim, Chan-Eun Rhee, Hyunguk Jung, Youngmin Yi, Dohyung Kim
{"title":"Virtual synchronization for fast distributed cosimulation of dataflow task graphs","authors":"S. Ha, Sungchan Kim, Chan-Eun Rhee, Hyunguk Jung, Youngmin Yi, Dohyung Kim","doi":"10.1145/581199.581238","DOIUrl":"https://doi.org/10.1145/581199.581238","url":null,"abstract":"Fast distributed cosimulation is a challenging problem for the embedded system design. The main theme of this paper is to increase simulation speed by reducing the frequency of inter-simulator communications, reducing the active duration of simulators and utilizing the parallelism of component simulators, which is accomplished by combining event-driven and data-driven simulation methods. The proposed technique is applicable when the simulated tasks follow dataflow execution semantics. Experimental results show that the proposed technique can boost the cosimulation speed significantly compared with the previous conservative approaches.","PeriodicalId":413693,"journal":{"name":"15th International Symposium on System Synthesis, 2002.","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129472870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}