International Conference on Compilers, Architecture, and Synthesis for Embedded Systems: Latest Publications, Page 6

Fine-grain dynamic instruction placement for L0 scratch-pad memory

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2010-10-24 DOI: 10.1145/1878921.1878943

Jongsoo Park, J. Balfour, W. Dally

Citations: 9

Parsimonious information technologies for pixels, perception, wetware and simulation: issues for Petrasek's global virtual hospital system

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2010-10-24 DOI: 10.1145/1878921.1878931

A. Barr

Abstract: New types of "engaging" embedded systems and devices will greatly assist future medical care, as for Petrasek's envisioned Global Virtual Hospital System. The most effective devices will need to be designed in a "parsimonious" way for their economic use of energy, digital bits, communication time, and in terms of trading more expensive physical structures for less expensive computational ones. At the technological level, each device needs a carefully selected "matched set" of technological tradeoffs between the particular medical and user ends and means. The matched set of choices would carefully make sure that the device "methods" and implementations lead reliably to the device "goals" and purposes.

In addition, however, there is a critical user-oriented aspect where the devices will also need to utilize highly "engaging environments" that are not too cumbersome or too tiring to use. People are becoming increasingly sophisticated with regard to the interactive requirements they have for their devices, from their experience with digital media, iPhones, video computer games and other types of environments that "engage" a person's attention for long periods of time, and without annoying delays and frustrations.

It is an absolute requirement that the devices incorporate highly engaging environments so that using them does not tire the user or cause unnecessary medical errors and delays.

This improved type of portable device, scanners, services and information methods would efficiently and more accurately gather sufficiently detailed medical information from the patient's body, help relay sufficient parts of the patient information electronically to a worldwide net of physicians and relay appropriate results and prescriptions back to the patient.

Citations: 0

Vertical stealing: robust, locality-aware do-all workload distribution for 3D MPSoCs

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2010-10-24 DOI: 10.1145/1878921.1878952

A. Marongiu, P. Burgio, L. Benini

Citations: 6

Slicing based code parallelization for minimizing inter-processor communication

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629409

M. Kandemir, Yuanrui Zhang, Sai Prashanth Muralidhara, O. Ozturk, S. Narayanan

Abstract: One of the critical problems in distributed memory multi-core architectures is scalable parallelization that minimizes inter-processor communication. Using the concept of iteration space slicing, this paper presents a new code parallelization scheme for data-intensive applications. This scheme targets distributed memory multi-core architectures, and formulates the problem of data-computation distribution (partitioning) across parallel processors using slicing such that, starting with the partitioning of the output arrays, it iteratively determines the partitions of other arrays as well as iteration spaces of the loop nests in the application code. The goal is to minimize inter-processor data communications. Based on this iteration space slicing based formulation of the problem, we also propose a solution scheme. The proposed data-computation scheme is evaluated using six data-intensive benchmark programs. In our experimental evaluation, we also compare this scheme against three alternate data-computation distribution schemes. The results obtained are very encouraging, indicating around 10% better speedup, with 16 processors, over the next-best scheme when averaged over all benchmark codes we tested.
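The abstract's idea of starting from the output-array partition and deriving everything else from it can be illustrated with a toy case. The following is a minimal sketch, not the paper's algorithm: it assumes a single one-dimensional loop `C[i] = A[i] + B[(i + 1) % n]`, and the names `block_partition`, `slice_from_output`, and `remote_reads` are hypothetical.

```python
def block_partition(n, procs):
    """Split indices 0..n-1 into contiguous blocks, one set per processor."""
    base, extra = divmod(n, procs)
    blocks, start = [], 0
    for p in range(procs):
        size = base + (1 if p < extra else 0)
        blocks.append(set(range(start, start + size)))
        start += size
    return blocks


def slice_from_output(n, procs):
    """Derive iteration and input-data sets from a block partition of output C.

    Assumed loop (illustrative only):
        for i in range(n): C[i] = A[i] + B[(i + 1) % n]
    """
    plan = []
    for c_block in block_partition(n, procs):
        iters = set(c_block)                       # iteration i writes C[i]
        plan.append({
            "iters": iters,
            "A": {i for i in iters},               # iteration i reads A[i]
            "B": {(i + 1) % n for i in iters},     # iteration i reads B[(i + 1) % n]
        })
    return plan


def remote_reads(plan, b_owned):
    """Count B elements each processor reads but does not own."""
    return sum(len(entry["B"] - owned) for entry, owned in zip(plan, b_owned))


plan = slice_from_output(8, 4)
aligned = remote_reads(plan, block_partition(8, 4))     # B split exactly like C
derived = remote_reads(plan, [e["B"] for e in plan])    # B split as the slices require
print(aligned, derived)
```

In this toy case, distributing B the same way as C leaves one remote read per processor, while distributing B according to the derived slices removes them. Real loop nests with several arrays and dependences need the iterative treatment the abstract describes.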

Citations: 0

A platform for developing adaptable multicore applications

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629418

D. Fay, L. Shang, D. Grunwald

Abstract: Computer systems are resource constrained. Application adaptation is a useful way to optimize system resource usage while satisfying the application performance constraints. Previous application adaptation efforts, however, were ad-hoc, time-consuming, and highly application-specific with limited portability between computer systems. In this work, our goal is to provide a development platform to systematically explore and rigorously apply portable application-specific runtime optimization. We present OCCAM, a software platform for developing multicore adaptive applications. OCCAM's design-time platform consists of APIs and data structures that allow application developers to specify the performance constraints and application-specific optimization techniques. OCCAM's run-time system dynamically manages the application behavior and optimizes system resource usage. OCCAM targets emerging Recognition, Mining, and Synthesis Applications (RMS). Using a set of RMS benchmarks, the experimental study demonstrates that OCCAM can successfully optimize resource usage under application performance constraints across a wide range of computer platforms, with an average of 38% energy savings on an Intel Atom-based, energy-constrained portable system, and an average of 24% energy savings on a high-performance, dual-core computer platform. These savings are accomplished with low overhead. We have also successfully extended OCCAM applications to run on a 16-core setup.

Citations: 4

Optimal loop parallelization for maximizing iteration-level parallelism

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629407

Duo Liu, Z. Shao, M. Wang, M. Guo, Jingling Xue

Citations: 18

Spatial complexity of reversibly computable DAG

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629404

Mouad Bahi, C. Eisenbeis

Abstract: In this paper we address the issue of making a program reversible in terms of spatial complexity. Spatial complexity is the amount of memory/register locations required for performing the computation in both forward and backward directions. Spatial complexity has an important relationship with the intrinsic power consumption required at run time; this was our primary motivation. But it also has an important relationship with the trade-off between storing or recomputing reused intermediate values, also known as the rematerialization problem in the context of compiler register allocation, or the checkpointing issue in the general case. We present a lower bound of the spatial complexity of a DAG (directed acyclic graph) with reversible operations, as well as a heuristic aimed at finding the minimum number of registers required for a forward and backward execution of a DAG. We define energetic garbage as the additional number of registers needed for the reversible computation with respect to the original computation. We have run experiments that suggest that the garbage size is never more than 50% of the DAG size for DAGs with unary/binary operations.

Citations: 3

Energy-aware probabilistic multiplier: design and analysis

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629434

Mark S. K. Lau, K. Ling, Y. Chu

Abstract: Probabilistic CMOS is considered to be a promising technology for substantial energy savings for computing devices, such as DSPs and graphics chips. The basic principle is to relax the energy requirement by allowing possibly incorrect computation results. For devices with probabilistic components, energy should be assigned to each component wisely, in order to achieve a good trade-off between energy consumption and correctness of the outputs. Recently, a few schemes have been proposed for energy assignment of ripple-carry adders, which are often based on intuitive arguments. In the present paper, we extend the idea of energy assignment to probabilistic multipliers. We focus on a fundamental type of multiplier, known as the array multiplier. We derive some analytical results. Guided by these results, we devise an energy assignment scheme. We also find that energy assignment for array multipliers and ripple-carry adders can be quite different, due to differences in their structures. To the best of our knowledge, our work here is the first attempt in the literature to consider energy assignment for multipliers. Some examples, including digital image enhancement, are presented to demonstrate the effectiveness of the proposed scheme.
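As a rough illustration of what "energy assignment" means for an array multiplier, the sketch below simulates a bit-level multiplier whose full adders may produce wrong outputs. It is a toy model under stated assumptions, not the paper's scheme: each adder output flips with a per-column probability, and a lower probability stands in for a larger energy budget. The names `noisy_full_adder`, `noisy_array_multiply`, and `p_err_per_column` are hypothetical.

```python
import random


def noisy_full_adder(a, b, cin, p_err, rng):
    """Full adder whose sum and carry outputs each flip with probability p_err."""
    s = a ^ b ^ cin
    c = (a & b) | (cin & (a ^ b))
    if rng.random() < p_err:
        s ^= 1
    if rng.random() < p_err:
        c ^= 1
    return s, c


def noisy_array_multiply(x, y, bits, p_err_per_column, rng):
    """Multiply two unsigned `bits`-wide integers row by row, as an array multiplier does.

    p_err_per_column[k] is the assumed error probability of the adders in
    output column k (k = 0 is the least significant bit).
    """
    width = 2 * bits
    acc = [0] * width                          # accumulated product bits, LSB first
    for row in range(bits):
        y_bit = (y >> row) & 1
        carry = 0
        for col in range(row, width):
            j = col - row
            # Partial-product bit: AND of one bit of x with the current bit of y.
            pp = ((x >> j) & 1) & y_bit if j < bits else 0
            acc[col], carry = noisy_full_adder(
                acc[col], pp, carry, p_err_per_column[col], rng
            )
    return sum(bit << k for k, bit in enumerate(acc))


rng = random.Random(0)
exact = noisy_array_multiply(13, 11, 4, [0.0] * 8, rng)   # error-free adders
print(exact)
```

With every probability set to zero the model reduces to ordinary multiplication. An assignment such as `[0.05] * 4 + [0.0] * 4` would confine errors to the four low-order columns, which is one way to experiment with how the placement of unreliable adders affects the size of the output error.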

Citations: 83

A fault tolerant cache architecture for sub 500mV operation: resizable data composer cache (RDC-cache)

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629431

Avesta Sasan, H. Homayoun, A. Eltawil, F. Kurdahi

Abstract: In this paper we introduce Resizable Data Composer-Cache (RDC-Cache). This novel cache architecture operates correctly at sub 500 mV in 65 nm technology, tolerating a large number of Manufacturing Process Variation induced defects. Based on a smart relocation methodology, RDC-Cache decomposes the data that is targeted for a defective cache way and relocates one or a few words to a new location, avoiding a write to defective bits. Upon a read request, the requested data is recomposed through an inverse operation. For the purpose of fault tolerance at low voltages the cache size is reduced; however, in this architecture the final cache size is considerably higher compared to previously suggested resizable cache organizations [2][3]. The following three features, a) compaction of relocated words, b) ability to use defective words for fault tolerance and c) "linking" (relocating the defective word to any row in the next bank), allow this architecture to achieve far larger fault tolerance in comparison to [2][3]. In high voltage mode, the fault tolerant mechanism of RDC-Cache is turned off with minimal (0.91%) latency overhead compared to a traditional cache.

Citations: 40

Tight WCRT analysis of synchronous C programs

International Conference on Compilers, Architecture, and Synthesis for Embedded Systems Pub Date : 2009-10-11 DOI: 10.1145/1629395.1629424

P. Roop, Sidharta Andalam, R. V. Hanxleden, S. Yuan, C. Traulsen

Abstract: Accurate estimation of the tick length of a synchronous program is essential for efficient and predictable implementations that are devoid of timing faults. The techniques to determine the tick length statically are classified as worst case reaction time (WCRT) analysis. While a plethora of techniques exist for worst case execution time (WCET) analysis of procedural programs, there are only a handful of techniques for determining the WCRT value of synchronous programs. Most of these techniques produce overestimates and hence are unsuitable for the design of systems that are predictable while being also efficient. In this paper, we present an approach for the accurate estimation of the exact WCRT value of a synchronous program, called its tight WCRT value, using model checking. For our input specifications we have selected a synchronous C based language called PRET-C that is designed for programming Precision Timed (PRET) architectures. We then present an approach for static WCRT analysis of these programs via an intermediate format called TCCFG. This intermediate representation is then compiled to produce the input for the model checker.

Experimental results that compare our approach to existing approaches demonstrate the benefits of the proposed approach. The proposed approach, while presented for PRET-C, is also applicable for WCRT analysis of Esterel using simple adjustments to the generated model. The proposed approach thus paves the way for a generic approach for determining the tight WCRT value of synchronous programs at compile time.
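The gap between an overestimate and a tight WCRT value can be seen in a small model. The sketch below is only an illustration under simplifying assumptions, not the TCCFG-based analysis from the paper: each thread is modeled as a cyclic list of per-tick costs, all threads advance in lockstep, and the tick length is the sum of the costs of the current tick. The function names `coarse_wcrt` and `tight_wcrt` are hypothetical.

```python
def coarse_wcrt(threads):
    """Overestimate: assume every thread hits its most expensive tick at once."""
    return sum(max(costs) for costs in threads)


def tight_wcrt(threads):
    """Explore the reachable joint states and return the worst tick among them.

    Each thread is a cyclic list of per-tick costs (an assumption of this
    sketch); all threads advance by one position on every global tick.
    """
    state = tuple(0 for _ in threads)
    seen = set()
    worst = 0
    while state not in seen:
        seen.add(state)
        tick_cost = sum(costs[i] for costs, i in zip(threads, state))
        worst = max(worst, tick_cost)
        state = tuple((i + 1) % len(costs) for costs, i in zip(threads, state))
    return worst


threads = [[5, 1], [1, 5]]      # the expensive ticks never coincide
print(coarse_wcrt(threads), tight_wcrt(threads))
```

Here the coarse bound charges both expensive ticks to the same instant, while the state exploration finds that they never occur together. A model checker plays the role of the explicit loop when the state space comes from real control flow rather than fixed cycles.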

Citations: 45