T. Aa, P. Raghavan, S. Mahlke, B. D. Sutter, Aviral Shrivastava, Frank Hannig
{"title":"Embedded tutorial — Compilation techniques for CGRAs: Exploring all parallelization approaches","authors":"T. Aa, P. Raghavan, S. Mahlke, B. D. Sutter, Aviral Shrivastava, Frank Hannig","doi":"10.1145/1878961.1878995","DOIUrl":"https://doi.org/10.1145/1878961.1878995","url":null,"abstract":"Coarse-Grained Reconfigurable Array (CGRA) processors accelerate inner loops of applications by exploiting instructionlevel parallelism (ILP) and in some cases also data-level and task-level parallelism (DLP & TLP). The aim of this tutorial is to give insight in CGRA architectures and their compilation techniques to exploit parallelism. These topics will be covered: · Polymorphic pipeline arrays, expanding coarse-grained arrays beyond innermost loops (Scott Mahlke, University of Michigan) · Code-generation for coarse-grained arrays: flexibility and programmer productivity (Bjorn De Sutter, Ghent University) · Memory-aware compilation techniques for CGRAs (Aviral Shrivastava, Arizona State University) · Retargetable Mapping of Loop Programs on Coarse-grained Reconfigurable Arrays (Frank Hannig, University of Erlangen-Nuremberg).","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130652146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Wiggers, L. Thiele, Edward A. Lee, S. Schliecker, M. Bekooij
{"title":"Modeling and analyzing real-time multiprocessor systems","authors":"M. Wiggers, L. Thiele, Edward A. Lee, S. Schliecker, M. Bekooij","doi":"10.1145/1878961.1879019","DOIUrl":"https://doi.org/10.1145/1878961.1879019","url":null,"abstract":"Researchers have proposed approaches to verify that real-time multiprocessor systems meet their timeliness constraints. These approaches make assumptions on the model of computation, the load placed on the multiprocessor system, and the faults that can arise. This heterogeneous set of assumptions make these approaches hard to compare. This tutorial will present an overview and positioning of four recently proposed approaches. We present for each approach, the application domain in which its assumptions are realistic and the dominant application requirements that have driven its development. Next to discussing timeliness guarantees, we give attention to the robustness aspects of each approach; e.g. against faults such as overload.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114690414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kim Grüttner, Henning Kleen, F. Oppenheimer, A. Rettberg, W. Nebel
{"title":"Towards a synthesis semantics for SystemC channels","authors":"Kim Grüttner, Henning Kleen, F. Oppenheimer, A. Rettberg, W. Nebel","doi":"10.1145/1878961.1878990","DOIUrl":"https://doi.org/10.1145/1878961.1878990","url":null,"abstract":"In this paper we propose a synthesis semantics for SystemC™ channels, which contribute to a clear separation between computation (algorithm) and communication, whereas communication related parts are modelled through either primitive or hierarchical channels. We present a synthesisable replacement for SystemC primitive channels that allows deterministic access scheduling and user-constrained refinement for HW/HW and HW/SW communication. We demonstrate the feasibility of our approach through synthesis and exploration of a communication intensive packet switch design under consideration of different configurations and communication refinements.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115937293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tor E. Jeremiassen, Tim Kogel, A. Takach, G. Martin, A. Donlin, K. Chatha
{"title":"From ESL 2010 to ESL 2015","authors":"Tor E. Jeremiassen, Tim Kogel, A. Takach, G. Martin, A. Donlin, K. Chatha","doi":"10.1145/1878961.1878973","DOIUrl":"https://doi.org/10.1145/1878961.1878973","url":null,"abstract":"In 2010, a wave of consolidation swept over the Electronic System Level (ESL) design industry. It brought ESL providers together with mainstream EDA houses and created opportunities for new ESL ventures. This paper contains short summaries of presentations in a special session focusing on the future of ESL. The session has two goals: the first is to present the state of the art in ESL tools and practice and, second, share a vision of the technical challenges that the next generation of ESL companies should address. The session includes a mix of perspectives from both ESL solution vendors and end-users and touches all all four ESL use cases: software virtual platforms, performance analysis, high level synthesis and verification.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133271657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A greedy buffer allocation algorithm for power-aware communication in body sensor networks","authors":"Hassan Ghasemzadeh, R. Jafari","doi":"10.1145/1878961.1878998","DOIUrl":"https://doi.org/10.1145/1878961.1878998","url":null,"abstract":"Monitoring human movements using wireless sensory devices promises to revolutionize the delivery of healthcare services. In spite of their potentials for many application domains, power requirements and wearability have limited the commercialization of these systems. In this paper, we propose a novel approach for optimizing communication energy by reducing inter-node data transmissions. This is accomplished by introducing buffers that limit communication to short bursts, and therefore decrease power usage and simplify the communication. Our buffer allocation is a greedy algorithm that can operate both in a centralized and distributed architecture. We experimentally demonstrate the effectiveness of our power reduction techniques. Our results show that, compared with an unbuffered system, our system achieves more than 30% reduction in energy consumption.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114462700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OE+IOE: A novel turn model based fault tolerant routing scheme for networks-on-chip","authors":"S. Pasricha, Yong Zou, D. Connors, H. Siegel","doi":"10.1145/1878961.1878979","DOIUrl":"https://doi.org/10.1145/1878961.1878979","url":null,"abstract":"Network-on-chip (NoC) communication architectures are increasingly being used today to interconnect cores on chip multiprocessors (CMPs). Permanent faults in NoCs due to fabrication challenges in ultra deep submicron (UDSM) technology nodes and due to wearout have led to an increased emphasis on fault tolerant design techniques. To ensure fault tolerant communication in NoCs, several fault tolerant routing algorithms have been proposed in recent years with the goal of routing flits around faults. A majority of these algorithms are based on the turn model approach due to its simplicity and inherent freedom from deadlock. However, existing turn model based fault tolerant routing algorithms are either too restrictive in the choice of paths that flits can traverse, or are tailored to work efficiently only on very specific fault distribution patterns. In this paper, we propose a novel low overhead fault tolerant routing scheme that combines the odd-even (OE) and inverted odd-even (IOE) turn models to achieve much better fault tolerance than traditional turn model based schemes. The proposed scheme uses replication opportunistically to optimize the balance between energy overhead and arrival rate. Our experimental results indicate that the proposed OE+IOE routing scheme provides better fault tolerance than existing turn model, N-random walk, and dual virtual channel based routing schemes that have been proposed in literature.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115056394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Worst-case performance analysis of Synchronous Dataflow scenarios","authors":"M. Geilen, S. Stuijk","doi":"10.1145/1878961.1878985","DOIUrl":"https://doi.org/10.1145/1878961.1878985","url":null,"abstract":"Synchronous Dataflow (SDF) is a powerful analysis tool for regular, cyclic, parallel task graphs. The behaviour of SDF graphs however is static and therefore not always able to accurately capture the behaviour of modern, dynamic dataflow applications, such as embedded multimedia codecs. An approach to tackle this limitation is by means of scenarios. In this paper we introduce a technique and a tool to automatically analyse a scenario-aware dataflow model for its worst-case performance. A system is specified as a collection of SDF graphs representing individual scenarios of behaviour and a finite state machine that specifies the possible orders of scenario occurrences. This combination accurately captures more dynamic applications and this way provides tighter results than an existing analysis based on a conservative static dataflow model, which is too pessimistic, while looking only at the `worst-case' individual scenario, without considering scenario transitions, can be too optimistic. We introduce a formal semantics of the model, in terms of (max; +) linear system-theory and in particular (max; +) automata. Leveraging existing results and algorithms from this domain, we give throughput analysis and state space generation algorithms for worst-case performance analysis. The method is implemented in a tool and the effectiveness of the approach is experimentally evaluated.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"1984 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130621857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FEMU: A firmware-based emulation framework for SoC verification","authors":"Hao Li, Dong Tong, Kan Huang, Xu Cheng","doi":"10.1145/1878961.1879007","DOIUrl":"https://doi.org/10.1145/1878961.1879007","url":null,"abstract":"Full-system emulation on FPGA(Field-Programmable Gate Array) with real-world workloads can enhance the confidence of SoC(System-on-Chip) design. However, since FPGA emulation requires complete implementation of key modules and provides weak visibility, it is time-consuming. This paper proposes FEMU, a hybrid firmware/hardware emulation framework for SoC verification. The core of FEMU is implemented by transplanting QEMU, a full-system emulator, from OS level to BIOS level, so we can directly emulate devices upon hardware. Moreover, FEMU provides programming interfaces to simplify device modeling in firmware. Based on an auxiliary set of hardware modules, FEMU allows hybrid full-system emulation with the combination of real hardware and emulated firmware model. Therefore, FEMU can facilitate full-system emulation in three aspects. First, FEMU enables full-system emulation with the minimum hardware implementation, so the DUT (Design Under Test) module can be verified under target application as early as possible. Second, by comparing the execution traces generated using real hardware and emulated firmware model, respectively, FEMU helps locate and fix bugs occurred in the full-system emulation. Third, by replacing un-verified hardware modules with emulated firmware models, FEMU helps isolating design issues in multiple modules. In a practical SoC project, FEMU helped us identify several design issues in full-system emulation. In addition, the evaluation results show that the emulation speed of FEMU is comparable with QEMU after transplantation.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128592229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A case for lifetime-aware task mapping in embedded chip multiprocessors","authors":"Adam S. Hartman, D. E. Thomas, B. Meyer","doi":"10.1145/1878961.1878987","DOIUrl":"https://doi.org/10.1145/1878961.1878987","url":null,"abstract":"Temperature-aware design is emerging as a popular approach to addressing a variety of challenges, including system lifetime. In the case of task mapping, temperature-aware approaches indeed improve lifetime due to lifetime's strong dependence on tempera-ture. However, temperature-aware design neglects several important factors that also influence lifetime: (a) physical parameters such as supply voltage and current density, as well as (b) application and architecture characteristics that affect what failures are survivable. Only lifetime-aware task mapping can expose the relationship between physical parameters, component failure, and system lifetime, and therefore find lifetime-optimal mappings. To address this need, we have developed a new lifetime-aware task mapping technique based on ant colony optimization (ACO). Our technique produces task mappings resulting in lifetimes with-in 17.9% of the observed optimal results on average, outperform-ing a lifetime-agnostic task mapping approach by an average of 32.3%. We also observed that the lifetimes resulting from task mappings within 1% of the best maximum system temperature vary by an average of 20.1% while the lifetimes resulting from task mappings within 1% of the best average system temperature vary by an average of 32.6%. Our observations lead us to conclude that one cannot depend on temperature-aware task mapping when system lifetime is a design constraint, but one may depend on lifetime-aware task mapping when one or both of lifetime and temperature are design constraints.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121419311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verification of dynamically reconfigurable embedded systems by model transformation rules","authors":"F. Madlener, J. Weingart, S. Huss","doi":"10.1145/1878961.1878969","DOIUrl":"https://doi.org/10.1145/1878961.1878969","url":null,"abstract":"This paper describes a methodology for the verification of reconfigurable embedded systems. The reconfigurable systems are described by means of the Reconfigurable Discrete Event Specified System (RecDEVS) computational model and the verification is performed by a model transformation from the RecDEVS model into an equivalent representation for the UPPAAL model checking methodology. We introduce an algorithm for the automatic transformation of such models, which originate from disjoint application domains. This allows the usage of an state-of-the art verification tool for the verification of arbitrary properties of system specifications denoted in RecDEVS. We also present a set of important system properties, which now may be verified. This set includes some fundamental reconfiguration domain specific properties, which were not addressed by previous formal verification methods. The feasibility of this approach is demonstrated for a complex automotive application.","PeriodicalId":118816,"journal":{"name":"2010 IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129036947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}