ACM Trans. Embed. Comput. Syst.: Latest Articles

Configuration and operation of networked control systems over heterogeneous WSANs
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536756
P. Furtado, J. Cecílio
Abstract: There have been both research and commercial advances on applying Wireless Sensor and Actuator Networks (WSN) in industrial premises. These have cost advantages related to avoiding some cabled deployments. A possible architecture involves a Networked Control System (NCS) with many small WSN subnetworks, cabled nodes and computer servers (e.g., servers, control stations). In those systems individual sensor nodes can be programmed, as opposed to cabled analog systems. We investigate approaches for networked-wide configuration, where all nodes (cabled or WSN sensors) can be configured with simplicity from a single interface, instead of hand-coding or complex configurations of individual nodes. We propose an architecture and approach for configuration and operation. Previous related proposals on middleware involving WSNs suffer from two major limitations: they either program within an individual WSN or configure operation outside WSNs, wrapping data coming from WSN. They do not allow configuring WSN and non-WSN nodes for operation from a single interface. We discuss the architecture and propose the NCSWSN configuration and operation approach. We are applying this system in an industrial testbed, therefore we test the approach and also show user interfaces and results from the deployment.
Citations: 0
A system-level infrastructure for multidimensional MP-SoC design space co-exploration
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536749
Zai Jian Jia, T. Bautista, A. Núñez, A. Pimentel, M. Thompson
Abstract: In this article, we present a flexible and extensible system-level MP-SoC design space exploration (DSE) infrastructure, called NASA. This highly modular framework uses well-defined interfaces to easily integrate different system-level simulation tools as well as different combinations of search strategies in a simple plug-and-play fashion. Moreover, NASA deploys a so-called dimension-oriented DSE approach, allowing designers to configure the appropriate number of, well-tuned and possibly different, search algorithms to simultaneously co-explore the various design space dimensions. As a result, NASA provides a flexible and re-usable framework for the systematic exploration of the multidimensional MP-SoC design space, starting from a set of relatively simple user specifications. To demonstrate the capabilities of the NASA framework and to illustrate its distinct aspects, we also present several DSE experiments in which, for example, we compare NASA configurations using a single search algorithm for all design space dimensions to configurations using a separate search algorithm per dimension. These proof-of-concept experiments indicate that the latter multidimensional co-exploration can find better design points and evaluates a higher diversity of design alternatives as compared to the more traditional approach of using a single search algorithm for all dimensions.
Citations: 18
System-level memory management based on statistical variability compensation for frame-based applications
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536757
Concepción Sanz, J. I. Gómez, C. Tenllado, M. Prieto, F. Catthoor
Abstract: Process variability and dynamic domains increase the uncertainty of embedded systems and force designers to apply pessimistic designs, which become unnecessarily conservative and have a tremendous impact on both performance and energy consumption. In this context, developing uncertainty-aware design methodologies that take both variation at platform and at application level into account becomes a must. These methodologies should mitigate the effects derived from uncertainty, avoiding worst-case assumptions. In this article we propose a comprehensive methodology to tackle two forms of uncertainty: (1) process variation on the memory system, (2) application dynamism. A statistical model has been developed to deal with variability derived from fabrication process, whereas system scenarios are selected to cope with dynamic domains. Both sources of uncertainty are firstly tackled in combination at design time, to be refined later, at setup. As a result, at run time the platform can be successfully adapted to the current application behaviour as well as the current variations. Our simulations show that this methodology provides significant energy savings while still meeting strict timing constraints.
Citations: 1
Automatic generation of high-speed accurate TLM models for out-of-order pipelined bus
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536759
C. Lo, Mao Lin Li, Li-Chun Chen, Yi-Shan Lu, R. Tsay, Hsu-Yao Huang, J. Yeh
Abstract: Although pipelined/out-of-order (PL/OO) execution features are commonly supported by the state-of-the-art bus designs, no existing manual Transaction-Level-Modeling (TLM) approaches can effectively construct fast and accurate simulation models for PL/OO buses. Mainly, the inherent high design complexity of concurrent PL/OO behaviors makes the manual approaches tedious and error-prone. To tackle the complicated modeling task, this article presents an automatic approach that performs systematic abstraction and generation of fast-and-accurate simulation models. The experimental results show that our approach reduces 21 times modeling efforts, while our generated models perform simulation an order of magnitude faster than Cycle-Accurate models with the same PL/OO transaction execution cycle counts preserved.
Citations: 1
Adaptive scheduling of real-time systems cosupplied by renewable and nonrenewable energy sources
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536758
M. Mohaqeqi, M. Kargahi, Maryam Dehghan
Abstract: Energy management is an important issue in today's real-time systems due to the high costs of energy supplying. Using renewable, like wave, wind, and solar energy sources seem promising methods to address this issue. However, because of the existing contrast between the critical nature of hard real-time systems and the unpredictable nature of renewable energies, some supplementary energy source like electricity grid or battery is needed. In this paper, we consider hard real-time systems with two renewable and nonrenewable energy sources. In order to reduce the costs, we present two dynamic voltage scaling controllers to minimize the energy attained from the latter source. In order to handle variations of the environmental energy and workload, the model predictive control approach is employed. One nonlinear approach beside one fast linear piecewise affine explicit controller are proposed. The efficacies of the proposed approaches have been investigated through extensive simulations. Comparisons to an ideal clairvoyant controller as a baseline show that, in the studied scenarios, the proposed controllers guarantee at least 78% of the baseline performance.
Citations: 3
Automated generation of polyhedral process networks from affine nested-loop programs with dynamic loop bounds
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-11-01 DOI: 10.1145/2536747.2536750
D. Nadezhkin, Hristo Nikolov, T. Stefanov
Abstract: The Process Networks (PNs) is a suitable parallel model of computation (MoC) used to specify embedded streaming applications in a parallel form facilitating the efficient mapping onto embedded parallel execution platforms. Unfortunately, specifying an application using a parallel MoC is a very difficult and highly error-prone task. To overcome the associated difficulties, we have developed the pn compiler, which derives specific Polyhedral Process Networks (PPN) parallel specifications from sequential static affine nested loop programs (SANLPs). However, there are many applications, for example, multimedia applications (MPEG coders/decoders, smart cameras, etc.) that have adaptive and dynamic behavior which cannot be expressed as SANLPs. Therefore, in order to handle dynamic multimedia applications, in this article we address the important question whether we can relax some of the restrictions of the SANLPs while keeping the ability to perform compile-time analysis and to derive PPNs. Achieving this would significantly extend the range of applications that can be parallelized in an automated way. The main contribution of this article is a first approach for automated translation of affine nested loop programs with dynamic loop bounds into input-output equivalent Polyhedral Process Networks. In addition, we present a method for analyzing the execution overhead introduced in the PPNs derived from programs with dynamic loop bounds. The presented automated translation approach has been evaluated by deriving a PPN parallel specification from a real-life application called Low Speed Obstacle Detection (LSOD) used in the smart cameras domain. By executing the derived PPN, we have obtained results which indicate that the approach we present in this article facilitates efficient parallel implementations of sequential nested loop programs with dynamic loop bounds. That is, our approach reveals the possible parallelism available in such applications, which allows for the utilization of multiple cores in an efficient way.
Citations: 7
Parallel architectures for the kNN classifier -- design of soft IP cores and FPGA implementations
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-09-01 DOI: 10.1145/2514641.2514649
I. Stamoulias, E. Manolakos
Abstract: We designed a variety of k-nearest-neighbor parallel architectures for FPGAs in the form of parameterizable soft IP cores. We show that they can be used to solve large classification problems with thousands of training vectors, or thousands of vector dimensions using a single FPGA, and achieve very high throughput. They can be used to flexibly synthesize architectures that also cover: 1NN classification (vector quantization), multishot queries (with different k), LOOCV cross-validation, and compare favorably to GPU implementations. To the best of our knowledge this is the first attempt to design flexible IP cores for the popular kNN classifier.
Citations: 28
Custom architecture for multicore audio beamforming systems
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-09-01 DOI: 10.1145/2514641.2514646
D. Theodoropoulos, G. Kuzmanov, G. Gaydadjiev
Abstract: The audio Beamforming (BF) technique utilizes microphone arrays to extract acoustic sources recorded in a noisy environment. In this article, we propose a new approach for rapid development of multicore BF systems. Research on literature reveals that the majority of such experimental and commercial audio systems are based on desktop PCs, due to their high-level programming support and potential of rapid system development. However, these approaches introduce performance bottlenecks, excessive power consumption, and increased overall cost. Systems based on DSPs require very low power, but their performance is still limited. Custom hardware solutions alleviate the aforementioned drawbacks, however, designers primarily focus on performance optimization without providing a high-level interface for system control and test. In order to address the aforementioned problems, we propose a custom platform-independent architecture for reconfigurable audio BF systems. To evaluate our proposal, we implement our architecture as a heterogeneous multicore reconfigurable processor and map it onto FPGAs. Our approach combines the software flexibility of General-Purpose Processors (GPPs) with the computational power of multicore platforms. In order to evaluate our system we compare it against a BF software application implemented to a low-power Atom 330, a middle-ranged Core2 Duo, and a high-end Core i3. Experimental results suggest that our proposed solution can extract up to 16 audio sources in real time under a 16-microphone setup. In contrast, under the same setup, the Atom 330 cannot extract any audio sources in real time, while the Core2 Duo and the Core i3 can process in real time only up to 4 and 6 sources respectively. Furthermore, a Virtex4-based BF system consumes more than an order less energy compared to the aforementioned GPP-based approaches.
Citations: 1
Design-space exploration and runtime resource management for multicores
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-09-01 DOI: 10.1145/2514641.2514647
Giovanni Mariani, G. Palermo, V. Zaccaria, C. Silvano
Abstract: Application-specific multicore architectures are usually designed by using a configurable platform in which a set of parameters can be tuned to find the best trade-off in terms of the selected figures of merit (such as energy, delay, and area). This multi-objective optimization phase is called Design-Space Exploration (DSE). Among the design-time (hardware) configurable parameters we can find the memory subsystem configuration (such as cache size and associativity) and other architectural parameters such as the instruction-level parallelism of the system processors. Among the runtime (software) configurable parameters we can find the degree of task-level parallelism associated with each application running on the platform. The contribution of this article is twofold; first, we introduce an evolutionary (NSGA-II-based) methodology for identifying a hardware configuration which is robust with respect to applications and corresponding datasets. Second, we introduce a novel runtime heuristic that exploits design-time identified operating points to provide guaranteed throughput to each application. Experimental results show that the design-time/runtime combined approach improves the runtime performance of the system with respect to existing reference techniques, while meeting the overall power budget.
Citations: 11
Memory performance estimation of CUDA programs
ACM Trans. Embed. Comput. Syst. Pub Date : 2013-09-01 DOI: 10.1145/2514641.2514648
Yooseong Kim, Aviral Shrivastava
Abstract: CUDA has successfully popularized GPU computing, and GPGPU applications are now used in various embedded systems. The CUDA programming model provides a simple interface to program on GPUs, but tuning GPGPU applications for high performance is still quite challenging. Programmers need to consider numerous architectural details, and small changes in source code, especially on the memory access pattern, can affect performance significantly. This makes it very difficult to optimize CUDA programs. This article presents CuMAPz, which is a tool to analyze and compare the memory performance of CUDA programs. CuMAPz can help programmers explore different ways of using shared and global memories, and optimize their program for efficient memory behavior. CuMAPz models several memory-performance-related factors: data reuse, global memory access coalescing, global memory latency hiding, shared memory bank conflict, channel skew, and branch divergence. Experimental results show that CuMAPz can accurately estimate performance with correlation coefficient of 0.96. By using CuMAPz to explore the memory access design space, we could improve the performance of our benchmarks by 30% more than the previous approach [Hong and Kim 2010].
Citations: 6