ACM Trans. Embed. Comput. Syst.: Latest Articles (Page 3)

ACDC: Small, Predictable and High-Performance Data Cache

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2677093

J. Segarra, Clemente Rodríguez Lafuente, Rubén Gran Tejero, L. Aparicio, V. Viñals

Abstract: In multitasking real-time systems, the worst-case execution time (WCET) of each task and the effects of interference between tasks in the worst-case scenario need to be calculated. This is especially complex in the presence of data caches. In this article, we propose a small instruction-driven data cache (256 bytes) that effectively exploits locality. It works by preselecting a subset of memory instructions that will have data cache replacement permission. Selection of such instructions is based on data reuse theory. Since each selected memory instruction replaces its own data cache line, it prevents pollution, and task performance becomes independent of the size of the associated data structures. We have modeled several memory configurations using the Lock-MS WCET analysis method. Our results show that, on average, our data cache effectively services 88% of program data of the tested benchmarks. Such results double the worst-case performance of our tested multitasking experiments. In addition, in the worst case, they reach between 75% and 89% of the ideal case of always hitting in instruction and data caches. We also show that using partitioning on our proposed hardware provides only marginal benefits in worst-case performance, so partitioning is discouraged. Finally, we study the viability of our proposal in the MiBench application suite by characterizing its data reuse, achieving hit ratios beyond 90% in most programs.

Citations: 5
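The replacement policy described in the ACDC abstract above can be illustrated with a toy model. This is a minimal sketch, not the authors' design: the class name `InstructionDrivenCache`, the 16-byte line size, the fully associative lookup, and the example program counters are all assumptions made for illustration. It shows only the core idea that each preselected memory instruction owns one cache line and is the only instruction allowed to replace it.

```python
class InstructionDrivenCache:
    """Toy model: one cache line per selected memory instruction (hypothetical)."""

    def __init__(self, selected_pcs, line_size=16):
        self.line_size = line_size  # bytes per line (assumed value)
        # Each selected instruction (identified by its PC) owns exactly one line.
        self.lines = {pc: None for pc in selected_pcs}
        self.hits = 0
        self.misses = 0

    def access(self, pc, addr):
        """Simulate one data access; return True on a hit."""
        tag = addr // self.line_size
        # Assumption: any instruction may hit on any line currently held.
        if tag in self.lines.values():
            self.hits += 1
            return True
        self.misses += 1
        # Only an instruction with replacement permission refills, and it
        # refills its own line, so it cannot evict another instruction's data.
        if pc in self.lines:
            self.lines[pc] = tag
        return False


if __name__ == "__main__":
    cache = InstructionDrivenCache(selected_pcs=[0x100])
    # The selected load at PC 0x100 walks 64 bytes in 4-byte steps.
    for addr in range(0, 64, 4):
        cache.access(0x100, addr)
        # An unselected load at PC 0x200 touches unrelated data in between.
        cache.access(0x200, 4096 + addr * 64)
    print(cache.hits, cache.misses)
```

In this model the unselected instruction always misses but never disturbs the line owned by the selected one, which is the pollution-avoidance property the abstract describes.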

Effective Runtime Resource Management Using Linux Control Groups with the BarbequeRTRM Framework

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2658990

P. Bellasi, G. Massari, W. Fornaciari

Abstract: The extremely advanced process technology reached by silicon manufacturing (smaller than 32nm) has led to the production of computational platforms and SoCs featuring a considerable amount of resources. While on the one hand such multi- and many-core platforms show growing performance capabilities, on the other hand they are increasingly affected by power, thermal, and reliability issues. Moreover, the increased computational capabilities allow congested usage scenarios with workloads subject to mixed and time-varying requirements. Effective usage of the resources should take into account both the application requirements and resource availability, with an arbiter, namely a resource manager, in charge of resolving resource contention among demanding applications. Current operating systems (OS) have only limited knowledge of application-specific behaviors and their time-varying requirements. Dedicated system interfaces to collect such inputs and forward them to the OS (e.g., its scheduler) are thus an interesting research area that aims at integrating the OS with an ad hoc resource manager. Such a component can exploit efficient low-level OS interfaces and mechanisms to extend its capabilities of controlling tasks and system resources. Because of the specific tasks and timings of a resource manager, this component can be easily and effectively developed as a user-space extension lying between the OS and the controlled application. This article, which focuses on multicore Linux systems, shows a portable solution to enforce runtime resource management decisions based on the standard control groups framework. A burst and a mixed workload analysis, performed on a multicore-based NUMA platform, have reported some promising results in terms of both performance and power saving.

Citations: 43
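As a rough illustration of what enforcing a resource decision through control groups can look like, the sketch below binds a set of processes to specific CPUs and memory nodes. It is not taken from BarbequeRTRM, whose implementation the abstract does not describe. It assumes the cgroup v1 `cpuset` controller file names (`cpuset.cpus`, `cpuset.mems`, `tasks`); the function name `bind_to_cpuset`, the group name, and the `root` parameter are hypothetical. On a real system `root` would be the mounted cpuset hierarchy (commonly `/sys/fs/cgroup/cpuset`) and the calls would need sufficient privileges.

```python
from pathlib import Path
from typing import Iterable


def bind_to_cpuset(root: Path, group: str, cpus: str, mems: str,
                   pids: Iterable[int]) -> Path:
    """Create a cpuset group under `root` and move `pids` into it.

    Hypothetical helper: it writes the cgroup v1 cpuset control files.
    """
    group_dir = root / group
    group_dir.mkdir(parents=True, exist_ok=True)
    # CPUs and memory nodes must be set before tasks can be attached.
    (group_dir / "cpuset.cpus").write_text(cpus)
    (group_dir / "cpuset.mems").write_text(mems)
    for pid in pids:
        # One PID per write, as the cgroup v1 `tasks` file expects.
        with open(group_dir / "tasks", "a") as tasks_file:
            tasks_file.write(f"{pid}\n")
    return group_dir


if __name__ == "__main__":
    import tempfile

    # Dry run against an ordinary directory instead of a real cgroup mount.
    with tempfile.TemporaryDirectory() as tmp:
        created = bind_to_cpuset(Path(tmp), "demo_app", "0-3", "0", [101, 102])
        print(sorted(p.name for p in created.iterdir()))
```

Taking the hierarchy root as a parameter keeps the sketch runnable without root privileges; a user-space resource manager of the kind the abstract describes would call such a helper each time its allocation policy changes.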

A Real-Time Multichannel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2661635

M. Gomony, B. Akesson, K. Goossens

Abstract: Ever-increasing demands for main memory bandwidth and the memory speed/power tradeoff have led to the introduction of memories with multiple memory channels, such as Wide IO DRAM. Efficient utilization of a multichannel memory as a shared resource in multiprocessor real-time systems depends on mapping the memory clients to the memory channels according to their requirements on latency, bandwidth, communication, and memory capacity. However, there is currently no real-time memory controller for multichannel memories, and there is no methodology to optimally configure multichannel memories in real-time systems. As a first step in this direction, we present two main contributions in this article: (1) a configurable real-time multichannel memory controller architecture with a novel method for logical-to-physical address translation and (2) two design-time methods to map memory clients to the memory channels, one an optimal algorithm based on an integer programming formulation of the mapping problem, and the other a fast heuristic algorithm. We demonstrate the real-time guarantees on bandwidth and latency provided by our multichannel memory controller architecture by experimental evaluation. Furthermore, we compare the performance of the mapping problem formulation in a solver and the heuristic algorithm against two existing mapping algorithms in terms of computation time and mapping success ratio. We show that, for realistically sized problems, an optimal solution can be found in 2 hours using the solver, and a solution can be found in less than 1 second with less than 7% mapping failure using the heuristic. Finally, we demonstrate configuring a Wide IO DRAM in a high-definition (HD) video and graphics processing system to emphasize the practical applicability and effectiveness of this work.

Citations: 18
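To make the client-to-channel mapping problem in the abstract above concrete, here is a minimal sketch of a generic first-fit-decreasing heuristic. It is not the heuristic or the integer programming formulation from the article, neither of which is given in the abstract. It considers only bandwidth and capacity and ignores latency and communication requirements. The function name `map_clients`, the client names, and all numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def map_clients(clients: Dict[str, Tuple[float, float]], n_channels: int,
                bw_cap: float, mem_cap: float) -> Optional[Dict[str, int]]:
    """Map each client (bandwidth, capacity) to one channel, or return None.

    Generic first-fit-decreasing sketch: place the most bandwidth-hungry
    clients first, each into the first channel that can still hold it.
    """
    bw_left = [bw_cap] * n_channels
    mem_left = [mem_cap] * n_channels
    mapping: Dict[str, int] = {}
    ordered = sorted(clients.items(), key=lambda item: item[1][0], reverse=True)
    for name, (bandwidth, capacity) in ordered:
        for channel in range(n_channels):
            if bw_left[channel] >= bandwidth and mem_left[channel] >= capacity:
                bw_left[channel] -= bandwidth
                mem_left[channel] -= capacity
                mapping[name] = channel
                break
        else:
            return None  # mapping failure: no channel can host this client
    return mapping


if __name__ == "__main__":
    # Hypothetical clients: name -> (bandwidth in MB/s, capacity in MB).
    demo = {"video": (600, 64), "gpu": (500, 128),
            "cpu": (300, 32), "audio": (100, 8)}
    print(map_clients(demo, n_channels=2, bw_cap=800, mem_cap=160))
    # → {'video': 0, 'gpu': 1, 'cpu': 1, 'audio': 0}
```

A greedy pass like this runs quickly but can fail on instances that an exact solver would map successfully, which is the speed-versus-success-ratio tradeoff the abstract reports between its heuristic and its solver-based method.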

Runtime Optimization of System Utility with Variable Hardware

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2656338

Paul D. Martin, L. Wanner, M. Srivastava

Abstract: Increasing hardware variability in newer integrated circuit fabrication technologies has caused corresponding power variations on a large scale. These variations are particularly exaggerated for idle power consumption, motivating the need to mitigate the effects of variability in systems whose operation is dominated by long idle states with periodic active states. In systems where computation is severely limited by anemic energy reserves and where a long overall system lifetime is desired, maximizing the quality of a given application subject to these constraints is both challenging and an important step toward achieving high-quality deployments. This work describes VaRTOS, an architecture and corresponding set of operating system abstractions that provide explicit treatment of both idle and active power variations for tasks running in real-time operating systems. Tasks in VaRTOS express elasticity by exposing individual knobs: shared variables that the operating system can tune to adjust task quality and, correspondingly, task power, maximizing application utility both on a per-task and on a system-wide basis. We provide results regarding online learning of instance-specific sleep power, active power, and task-level power expenditure on simulated hardware with demonstrated effects for several prototypical applications. Our results on networked sensing applications, which are representative of a broader category of applications that VaRTOS targets, show that VaRTOS can reduce variability-induced energy expenditure errors from over 70% in many cases to under 2% in most cases and under 5% in the worst case.

Citations: 4

Using a Flexible Fault-Tolerant Cache to Improve Reliability for Ultra Low Voltage Operation

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2629566

Abbas BanaiyanMofrad, H. Homayoun, N. Dutt

Abstract: Caches are known to consume a large part of total microprocessor power. Traditionally, voltage scaling has been used to reduce both dynamic and leakage power in caches. However, aggressive voltage reduction causes process-variation-induced failures in cache SRAM arrays, which compromise cache reliability. In this article, we propose FFT-Cache, a flexible fault-tolerant cache that uses a flexible defect map to configure its architecture to achieve significant reduction in energy consumption through aggressive voltage scaling while maintaining high error reliability. FFT-Cache uses a portion of faulty cache blocks as redundancy (using block-level or line-level replication within or between sets) to tolerate other faulty cache lines and blocks. Our configuration algorithm categorizes the cache lines based on the degree of conflict between their blocks to reduce the granularity of redundancy replacement. FFT-Cache thereby sacrifices a minimal number of cache lines to avoid impacting performance while tolerating the maximum amount of defects. Our experimental results on a processor executing SPEC2K benchmarks demonstrate that the operational voltage of both L1 and L2 caches can be reduced to 375 mV, which achieves up to 80% reduction in dynamic power and up to 48% reduction in leakage power. This comes with only a small performance loss (<5%) and 13% area overhead.

Citations: 7

Editorial: Oh Security—Where Art Thou?

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2742044

S. Shukla

Abstract: As I write this editorial for Volume 14, Issue 2 of the ACM Transactions on Embedded Computing Systems, I am riled up with the concern that my medical data, together with much personal information, might be in the hands of some identity thief after the security breach of my health insurer Anthem. It seems that data on tens of millions of customers might have been stolen by hackers, which could include me and many of my colleagues. This is not the only incident on our minds these days. Right before the winter holidays of 2014, a German steel plant was struck by hackers: they manipulated and disrupted the control system of the plant and caused physical damage. Also, who can forget that the breach of SONY Entertainment caused an uproar right before that, and ended up determining the fate of a movie's impending worldwide release? These are but a few highly publicized cases. According to the reports I read, most government information systems around the world are targeted hundreds of times a day by hackers, malicious or benign. We have created a digital world, the interconnected world of devices, machines, and systems, and the flip side of all that is the incessant attacks and insecurity. While the Anthem breach leaves us with the possibility of loss of privacy, identity theft, and other malicious use of our personal information by miscreants, the attack on the German steel plant leaves us with the possibility of cyberattacks that could lead to another Bhopal disaster or a Chernobyl, depending on how sophisticated and massive the attack might be on existing chemical or nuclear plants. On top of all this, the extant cybersecurity measures of these systems are not only insufficient, they are also often retrofitted without a proper proof of security. While in the past such systems were isolated from the prying eyes of hackers through air gaps and obscurity, this is no longer the case. The IP convergence that provides the comfort of browsing live data on the state of the plants from the offices and homes of engineers also created the Achilles heel of such systems. With the growth of handheld devices and high-speed wireless networking, there is no going back on that, while we stand exposed to the possibility of huge industrial accidents at the hands of hackers who might even be state actors. In this brave new world, we need to make cybersecurity …

Citations: 0

Temperature-Aware Data Allocation for Embedded Systems with Cache and Scratchpad Memory

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2629650

Zhiping Jia, Yang Li, Yi Wang, M. Wang, Z. Shao

Citations: 6

Placement of Linked Dynamic Data Structures over Heterogeneous Memories in Embedded Systems

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2656208

Miguel Peón Quirós, A. Bartzas, S. Mamagkakis, F. Catthoor, J. Mendias, D. Soudris

Citations: 2

Automatic Update of Indoor Location Fingerprints with Pedestrian Dead Reckoning

ACM Trans. Embed. Comput. Syst. Pub Date : 2015-03-25 DOI: 10.1145/2667226

Daisuke Taniuchi, T. Maekawa