{"title":"Cache-Related Preemption Delay Analysis for Multilevel Noninclusive Caches","authors":"Sudipta Chattopadhyay, Abhik Roychoudhury","doi":"10.1145/2632156","DOIUrl":"https://doi.org/10.1145/2632156","url":null,"abstract":"With the rapid growth of complex hardware features, timing analysis has become an increasingly difficult problem. The key to solving this problem lies in the precise and scalable modeling of performance-enhancing processor features (e.g., cache). Moreover, real-time systems are often multitasking and use preemptive scheduling, with fixed or dynamic priority assignment. For such systems, cache related preemption delay (CRPD) may increase the execution time of a task. Therefore, CRPD may affect the overall schedulability analysis. Existing works propose to bound the value of CRPD in a single-level cache. In this article, we propose a CRPD analysis framework that can be used for a two-level, noninclusive cache hierarchy. In addition, our proposed framework is also applicable in the presence of shared caches. We first show that CRPD analysis faces several new challenges in the presence of a multilevel, noninclusive cache hierarchy. Our proposed framework overcomes all such challenges and we can formally prove the correctness of our framework. We have performed experiments with several subject programs, including an unmanned aerial vehicle (UAV) controller and an in-situ space debris monitoring instrument. Our experimental results suggest that we can provide sound and precise CRPD estimates using our framework.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126768218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimizing Stack and Communication Memory Usage in Real-Time Embedded Applications","authors":"Haibo Zeng, M. Natale, Qi Zhu","doi":"10.1145/2632160","DOIUrl":"https://doi.org/10.1145/2632160","url":null,"abstract":"In the development of real-time embedded applications, especially those on systems-on-chip, an efficient use of RAM memory is as important as the effective scheduling of the computation resources. The protection of communication and state variables accessed by concurrent tasks must provide real-time schedulability guarantees while using the least amount of memory. Several schemes, including preemption thresholds, have been developed to improve schedulability and save stack space by selectively disabling preemption. However, the design synthesis problem is still open. In this article, we target the assignment of the scheduling parameters to minimize memory usage for systems of practical interest, including designs compliant with automotive standards. We propose algorithms either proven optimal or shown to improve on randomized optimization methods like simulated annealing.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124434328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Diamantopoulos, Efstathios Sotiriou-Xanthopoulos, K. Siozios, G. Economakos, D. Soudris
{"title":"Plug&Chip: A Framework for Supporting Rapid Prototyping of 3D Hybrid Virtual SoCs","authors":"D. Diamantopoulos, Efstathios Sotiriou-Xanthopoulos, K. Siozios, G. Economakos, D. Soudris","doi":"10.1145/2661634","DOIUrl":"https://doi.org/10.1145/2661634","url":null,"abstract":"In the embedded system domain there is a continuous demand towards providing higher flexibility for application development. This trend strives for virtual prototyping solutions capable of performing fast system simulation. Among other benefits, such a solution supports concurrent hardware/software system design by enabling to start developing, testing, and validating the embedded software substantially earlier than has been possible in the past. Towards this direction, throughout this article we introduce a new framework, named Plug&Chip, targeting to support rapid prototyping of 2D and 3D digital systems. In contrast to other relevant approaches, our solution provides higher flexibility by enabling incremental system design, while also handling platforms developed with the usage of 3D integration technology.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127514536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Provably Good Task Assignment for Two-Type Heterogeneous Multiprocessors Using Cutting Planes","authors":"Björn Andersson, Gurulingesh Raravi","doi":"10.1145/2660495","DOIUrl":"https://doi.org/10.1145/2660495","url":null,"abstract":"Consider scheduling of real-time tasks on a multiprocessor where migration is forbidden. Specifically, consider the problem of determining a task-to-processor assignment for a given collection of implicit-deadline sporadic tasks upon a multiprocessor platform in which there are two distinct types of processors. For this problem, we propose a new algorithm, LPC (task assignment based on solving a Linear Program with Cutting planes). The algorithm offers the following guarantee: for a given task set and a platform, if there exists a feasible task-to-processor assignment, then LPC succeeds in finding such a feasible task-to-processor assignment as well but on a platform in which each processor is 1.5 × faster and has three additional processors. For systems with a large number of processors, LPC has a better approximation ratio than state-of-the-art algorithms. To the best of our knowledge, this is the first work that develops a provably good real-time task assignment algorithm using cutting planes.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132114047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Embedded, Cyber-Physical, Hybrid…","authors":"S. Shukla","doi":"10.1145/2678027","DOIUrl":"https://doi.org/10.1145/2678027","url":null,"abstract":"This editorial is the last one in Volume 13 of this journal. Thanks to the ACM publications staff especially Laura Lander, our journal assistant, Ms. Jeyel Tecson, and, most importantly, our production manager, Joanne Pello. We have been able to fit all the pending special issues and many articles waiting in the queue for a very long time inside the various extra issues of Volume 13, some of which are online-only issues to save pages and time. The next issue will begin Volume 14 of this journal, and I note that with great admiration for everyone who helped us clear the immense backlog – in particular the special issue articles. Before I briefly summarize the content of this last issue of Volume 13, let me bring up a topic germane to this journal and to the entire field of embedded systems research and education. It is about terminology, taxonomy of research areas or sub-disciplines that often confuse us, and often launch us into great debate at conference venues and other meeting places. This journal is titled Transactions on Embedded Computing Systems. What exactly does the discipline \" embedded computing systems \" entail? These days, the most thrown-around term is cyber-physical systems (CPS). In the mid-nineties there were a lot of discussions on hybrid systems (HS). What are their ontological relationships? It is actually interesting to note that, while all of these are related and often used in place of each other, certainly, chronologically, CPS is the newest buzzword that has been making the rounds for about the last 10 years, starting with the US federal funding agencies and gradually reaching worldwide acceptance. Hybrid systems initially meant systems where at least two models of computation interact – in particular, a continuous time one, representing the dynamics of a physical system, and a discrete time one, representing digital computer-based implementation of control algorithms that oversee such physical dynamics. In the early nineties, much of the research in hybrid systems saw a convergence of control systems research and digital implementation of control algorithms. In particular, the main focus was on formal models as well as the representation and analysis of such models for verification and synthesis purposes. Hybrid automata, rectangular automata, hybrid I/O automata, timed automata, and many other formal model and analysis techniques were proposed and implemented. Certainly, these models and analysis techniques that originated from the need to verify that control …","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132490490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamically Instrumenting the QEMU Emulator for Linux Process Trace Generation with the GDB Debugger","authors":"B. Mihajlovic, Z. Zilic, W. Gross","doi":"10.1145/2678022","DOIUrl":"https://doi.org/10.1145/2678022","url":null,"abstract":"In software debugging, trace generation techniques are used to resolve highly complex bugs. However, the emulators increasingly used for embedded software development do not yet offer the types of trace generation infrastructure available in hardware. In this article, we make changes to the ARM ISA emulation of the QEMU emulator to allow for continuous instruction-level trace generation. Using a standard GDB client, tracepoints can be inserted to dynamically log registers and memory addresses without altering executing code. The ability to run trace experiments in five different modes allows the scope of trace generation to be narrowed as needed, down to the level of a single Linux process. Our scheme collects the execution traces of a Linux process on average between 9.6x--0.7x the speed of existing QEMU trace capabilities, with 96.7% less trace data volume. Compared to a software-instrumented tracing scheme, our method is both unobtrusive and performs on average between 3--4 orders of magnitude faster.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128913586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling Temporal Data with Dynamic Snapshot Consistency Requirement in Vehicular Cyber-Physical Systems","authors":"Kai Liu, V. Lee, J. Ng, S. Son, E. Sha","doi":"10.1145/2629546","DOIUrl":"https://doi.org/10.1145/2629546","url":null,"abstract":"Timely and efficient data dissemination is one of the fundamental requirements to enable innovative applications in vehicular cyber-physical systems (VCPS). In this work, we intensively analyze the characteristics of temporal data dissemination in VCPS. On this basis, we formulate the static and dynamic snapshot consistency requirements on serving real-time requests for temporal data items. Two online algorithms are proposed to enhance the system performance with different requirements. In particular, a reschedule mechanism is developed to make the scheduling adaptable to the dynamic snapshot consistency requirement. A comprehensive performance evaluation demonstrates the superiority of the proposed algorithms.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121571165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An analytical approach for fast and accurate design space exploration of instruction caches","authors":"Yun Liang, T. Mitra","doi":"10.1145/2539036.2539039","DOIUrl":"https://doi.org/10.1145/2539036.2539039","url":null,"abstract":"Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip area. Simulation, in particular trace-driven simulation, is widely used to estimate cache hit rates. However, simulation is too slow to be deployed in design space exploration, especially when there are hundreds of design points and the traces are huge. In this article, we propose a novel analytical approach for design space exploration of instruction caches. Given the program control flow graph (CFG) annotated only with basic block and control flow edge execution counts, we first model the cache states at each point of the CFG in a probabilistic manner. Then, we exploit the structural similarities among related cache configurations to estimate the cache hit rates for multiple cache configurations in one pass. Experimental results indicate that our analysis is 28--2,500 times faster compared to the fastest known cache simulator while maintaining high accuracy (0.2% average error) in estimating cache hit rates for a large set of popular benchmarks. Moreover, compared to a state-of-the-art cache design space exploration technique, our approach achieves 304--8,086 times speedup and saves up to 62% (average 7%) energy for the evaluated benchmarks.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114887697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An adaptive, low-cost wear-leveling algorithm for multichannel solid-state disks","authors":"Li-Pin Chang, Tung-Yang Chou, Li-Chun Huang","doi":"10.1145/2539036.2539051","DOIUrl":"https://doi.org/10.1145/2539036.2539051","url":null,"abstract":"Multilevel flash memory cells double or even triple storage density, producing affordable solid-state disks for end users. As flash memory endures only limited program-erase cycles, solid-state disks employ wear-leveling methods to prevent any portions of flash memory from being retired prematurely. Modern solid-state disks must consider wear evenness at both block and channel levels. This study first presents a block-level wear-leveling method whose design has two new ideas. First, the proposed method reuses the intelligence available in flash-translation layers so it does not require any new data structures. Second, it adaptively tunes the threshold of block-level wear leveling according to the runtime write pattern. This study further introduces a new channel-level wear-leveling strategy, because block-level wear leveling is confined to a channel, but realistic workloads do not evenly write all channels. The proposed method swaps logical blocks among channels for achieving an eventually-even state of channel lifetimes. A series of trace-driven simulations show that our wear-leveling method outperforms existing approaches in terms of wear evenness and overhead reduction.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124077291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yu-Ming Chang, P. Hsiu, Yuan-Hao Chang, Che-Wei Chang
{"title":"A resource-driven DVFS scheme for smart handheld devices","authors":"Yu-Ming Chang, P. Hsiu, Yuan-Hao Chang, Che-Wei Chang","doi":"10.1145/2539036.2539049","DOIUrl":"https://doi.org/10.1145/2539036.2539049","url":null,"abstract":"Reducing the energy consumption of the emerging genre of smart handheld devices while simultaneously maintaining mobile applications and services is a major challenge. This work is inspired by an observation on the resource usage patterns of mobile applications. In contrast to existing DVFS scheduling algorithms and history-based prediction techniques, we propose a resource-driven DVFS scheme in which resource state machines are designed to model the resource usage patterns in an online fashion to guide DVFS. We have implemented the proposed scheme on Android smartphones and conducted experiments based on real-world applications. The results are very encouraging and demonstrate the efficacy of the proposed scheme.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115510237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}