Jinkyu Jeong, Hwanju Kim, Jeaho Hwang, Joonwon Lee, S. Maeng
{"title":"DaaC: device-reserved memory as an eviction-based file cache","authors":"Jinkyu Jeong, Hwanju Kim, Jeaho Hwang, Joonwon Lee, S. Maeng","doi":"10.1145/2380403.2380439","DOIUrl":"https://doi.org/10.1145/2380403.2380439","url":null,"abstract":"Most embedded systems require contiguous memory space to be reserved for each device, which may lead to memory under-utilization. Although several approaches have been proposed to address this issue, they have limitations of either inefficient memory usage or long latency for switching the reserved memory space between a device and general-purpose uses.\nOur scheme utilizes the reserved memory as an eviction-based file cache. It guarantees contiguous memory allocation to devices while providing idle device memory as an additional file cache called eCache for general-purpose usage. Since the eCache stores only evicted data from in-kernel page cache, memory efficiency is preserved and allocation time for devices is minimized. Cost-based region selection also alleviates additional read I/Os by carefully discarding cached data from the eCache. The prototype is implemented on the Nexus S smartphone and is evaluated with popular Android applications. The evaluation results show that 50%-85% of flash read I/Os are reduced and application launch performance is improved by 8%-16% while the reallocation time is limited to a few milliseconds.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130326740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Z. Dyka, C. Walczyk, D. Walczyk, C. Wenger, P. Langendörfer
{"title":"Side channel attacks and the non volatile memory of the future","authors":"Z. Dyka, C. Walczyk, D. Walczyk, C. Wenger, P. Langendörfer","doi":"10.1145/2380403.2380413","DOIUrl":"https://doi.org/10.1145/2380403.2380413","url":null,"abstract":"In this paper, we describe a new non-volatile memory, based on metal-insulator-metal that provides performance benefits compared to standard Flash memory. In addition and more importantly, it comes with some advantages with respect to side channel attacks, i.e., its structure prevents by default optical analysis.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125741582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedded reconfigurable architectures","authors":"Stephan Wong","doi":"10.1145/2380403.2380444","DOIUrl":"https://doi.org/10.1145/2380403.2380444","url":null,"abstract":"In current-day embedded systems design, one is faced with cut-throat competition to deliver new functionalities in increasingly shorter time frames. This is now achieved by incorporating processor cores into embedded systems through (re-)programmability. However, this is not always beneficial for the performance or energy consumption. Therefore, adaptable embedded systems have been proposed to deal with these negative effects by reconfiguring the critical sections of an embedded system. In these proposals, we are clearly witnessing a trend that is moving from static configurations to dynamic (re)configurations.\nConsequently, the proposed embedded systems can adapt their functionality at run-time to meet the application(s) requirements (e.g., performance) while operating in different environments (e.g., power and hardware resources). Besides processor cores, we have to deal with memory hierarchies and network-on-chips that should also be (dynamically) reconfigurable. Furthermore, the interplay of these components is increasing the design complexity that can be only alleviated if they can self-optimize.\nIn this tutorial, we will present and discuss several strategies to perform the mentioned dynamic reconfiguration of the processor, memory, and NoC components - together with their interaction. We will review and present the state-of-the-art for the design of each component that allows for a gradual selection of design points in the trade-off between performance and power. Finally, we will highlight an open-source project that incorporates many approaches for dynamic reconfiguration in both actual hardware and simulation accompanied by the necessary tools.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127210216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic generation of hardware/software interfaces","authors":"Arvind","doi":"10.1145/2038698.2038700","DOIUrl":"https://doi.org/10.1145/2038698.2038700","url":null,"abstract":"Specialized hardware is necessary to reduce power consumption in mobile devices. Current design methodologies require an early partitioning of the application, allowing the hardware and software to be developed simultaneously, each adhering to a rigid interface contract. Early specification of detailed interface contracts is difficult and prevents the later migration of functionality across the interface. We address this problem using the Bluespec Codesign Language (BCL) which permits the designer to specify the hardware-software partition in the source code, allowing the compiler to synthesize efficient software and hardware along with transactors for communication between the partitions. We will present preliminary results generated using our compiler for various hardware-software decompositions of several applications.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128098493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges for embedded multicore architecture","authors":"L. Carro, G. Gaydadjiev","doi":"10.1145/1878921.1878960","DOIUrl":"https://doi.org/10.1145/1878921.1878960","url":null,"abstract":"In this tutorial we discuss the impact of multicore architectures for embedded devices at different levels, ranging from heterogeneous/homogeneous ISAs to the organization and software development.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"168 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123166532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Arup Chakraborty, H. Homayoun, A. Djahromi, N. Dutt, A. Eltawil, F. Kurdahi
{"title":"E < MC2: less energy through multi-copy cache","authors":"Arup Chakraborty, H. Homayoun, A. Djahromi, N. Dutt, A. Eltawil, F. Kurdahi","doi":"10.1145/1878921.1878956","DOIUrl":"https://doi.org/10.1145/1878921.1878956","url":null,"abstract":"Caches are known to consume a large part of total microprocessor power. Traditionally, voltage scaling has been used to reduce both dynamic and leakage power in caches. However, aggressive voltage reduction causes process-variation-induced failures in cache SRAM arrays, which compromise cache reliability. We present Multi-Copy Cache (MC2), a new cache architecture that achieves significant reduction in energy consumption through aggressive voltage scaling, while maintaining high error resilience (reliability) by exploiting multiple copies of each data item in the cache. Unlike many previous approaches, MC2 does not require any error map characterization and therefore is responsive to changing operating conditions (e.g., Vdd-noise, temperature and leakage) of the cache. MC2 also incurs significantly lower overheads compared to other ECC-based caches. Our experimental results on embedded benchmarks demonstrate that MC2 achieves up to 60% reduction in energy and energy-delay product (EDP) with only 3.5% reduction in IPC and no appreciable area overhead.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122884368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balancing memory and performance through selective flushing of software code caches","authors":"Apala Guha, K. Hazelwood, M. Soffa","doi":"10.1145/1878921.1878923","DOIUrl":"https://doi.org/10.1145/1878921.1878923","url":null,"abstract":"Dynamic binary translators (DBTs) are becoming increasingly important because of their power and flexibility. However, the high memory demands of DBTs present an obstacle for all platforms, and especially embedded systems. The memory demand is typically controlled by placing a limit on cached translations and forcing the DBT to flush all translations upon reaching the limit. This solution manifests as a performance inefficiency because many flushed translations require retranslation. Ideally, translations should be selectively flushed to minimize retranslations for a given memory limit. However, three obstacles exist: (1) it is difficult to predict which selections will minimize retranslation, (2) selective flushing results in greater book-keeping overheads than full flushing, and (3) the emergence of multicore processors and multi-threaded programming complicates most flushing algorithms. These issues have led to the widespread adoption of full flushing as a standard protocol. In this paper, we present a partial flushing approach aimed at reducing retranslation overhead and improving overall performance, given a fixed memory budget. Our technique applies uniformly to single-threaded and multi-threaded guest applications.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130368215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xuejun Yang, Li Wang, Jingling Xue, T. Tang, Xiaoguang Ren, S. Ye
{"title":"Improving scratchpad allocation with demand-driven data tiling","authors":"Xuejun Yang, Li Wang, Jingling Xue, T. Tang, Xiaoguang Ren, S. Ye","doi":"10.1145/1878921.1878942","DOIUrl":"https://doi.org/10.1145/1878921.1878942","url":null,"abstract":"Existing scratchpad memory (SPM) allocation algorithms for arrays, whether they rely on well-crafted heuristics or resort to integer linear programming (ILP) techniques, typically assume that every array is small enough to fit directly into the SPM. As a result, some arrays have to be spilled entirely to the off-chip memory in order to make room for other arrays to stay in the SPM, resulting in sometimes poor SPM utilization.\nIn this paper, we introduce a new comparability graph coloring allocator that integrates for the first time data tiling and SPM allocation for arrays by tiling arrays on-demand to improve utilization of the SPM. The novelty lies in repeatedly identifying the heaviest path in an array interference graph and then reducing its weight by tiling certain arrays on the path appropriately with respect to the size of the SPM. The effectiveness of our allocator, which is presently restricted to tiling 1-D arrays, is validated by using a number of selected benchmarks for which existing allocators are ineffective.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114231140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mosaic of organic development through technology intervention in the rural Indian context","authors":"Rajeswari Pingali, P. Niranjana","doi":"10.1145/1878921.1878929","DOIUrl":"https://doi.org/10.1145/1878921.1878929","url":null,"abstract":"The bottom of the pyramid (BOP) concept was introduced around late 2000 by Prof. C. K. Prahalad (August 8, 1941 – April 16, 2010), a globally known figure who consulted the top management of many of the world's foremost companies and was the Paul and Ruth McCracken Distinguished University Professor of Corporate Strategy at the Stephen M. Ross School of Business in the University of Michigan. It was taken up with unusual enthusiasm by corporate entities across the world, who started to see the people of India with small incomes as a fertile bed for wealth creation, and the Government of India too looked at BOP as a mechanism for engaging with the corporate sector through Public Private Partnerships for creating efficient processes for governance. However, some sceptics like us view BOP differently, owing to the unique concern we have about the process of wealth distribution that is enabled by BOP.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127592062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Z. Kedem, V. Mooney, Kirthi Krishna Muntimadugu, K. Palem, Avani Devarasetty, Phani Deepak Parasuramuni
{"title":"Optimizing energy to minimize errors in dataflow graphs using approximate adders","authors":"Z. Kedem, V. Mooney, Kirthi Krishna Muntimadugu, K. Palem, Avani Devarasetty, Phani Deepak Parasuramuni","doi":"10.1145/1878921.1878948","DOIUrl":"https://doi.org/10.1145/1878921.1878948","url":null,"abstract":"Approximate arithmetic is a promising, new approach to low-energy designs while tackling reliability issues. We present a method to optimally distribute a given energy budget among adders in a dataflow graph so as to minimize expected errors. The method is based on new formal mathematical models and algorithms, which quantitatively characterize the relative importance of the adders in a circuit. We demonstrate this method on a finite impulse response filter and a Fast Fourier Transform. The optimized energy distribution yields 2.05X lower error in a 16-point FFT and images with SNR 1.42X higher than those achieved by the best previous approach.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114074811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}