2014 Design, Automation & Test in Europe Conference & Exhibition (DATE): Latest Papers

A novel model for system-level decision making with combined ASP and SMT solving
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-4 DOI: 10.7873/DATE.2014.230
Alexander Biewer, J. Gladigau, C. Haubelt
Abstract: In this paper, we present a novel model enabling system-level decision making for time-triggered many-core architectures in automotive systems. The proposed application model includes shared data entities that need to be bound to memories during decision making. As a key enabler to our approach, we explicitly separate computation and shared memory communication over a network-on-chip (NoC). To deal with contention on a NoC, we model the necessary basis to implement a time-triggered schedule that guarantees freedom of interference. We compute fundamental design decisions, namely (a) spatial binding, (b) multi-hop routing, and (c) time-triggered scheduling, by a novel coupling of answer set programming (ASP) with satisfiability modulo theories (SMT) solvers. First results of an automotive case study demonstrate the applicability of our method for complex real-world applications.
Citations: 4
Time-critical computing on a single-chip massively parallel processor
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.110
B. Dinechin, D. V. Amstel, Marc Poulhiès, Guillaume Lager
Abstract: The requirement of high performance computing at low power can be met by the parallel execution of an application on a possibly large number of programmable cores. However, the lack of accurate timing properties may prevent parallel execution from being applicable to time-critical applications. We illustrate how this problem has been addressed by suitably designing the architecture, implementation, and programming model of the Kalray MPPA®-256 single-chip many-core processor. The MPPA®-256 (Multi-Purpose Processing Array) processor integrates 256 processing engine (PE) cores and 32 resource management (RM) cores on a single 28 nm CMOS chip. These VLIW cores are distributed across 16 compute clusters and 4 I/O subsystems, each with a locally shared memory. On-chip communication and synchronization are supported by an explicitly addressed dual network-on-chip (NoC), with one node per compute cluster and 4 nodes per I/O subsystem. Off-chip interfaces include DDR, PCI and Ethernet, and a direct access to the NoC for low-latency processing of data streams. The key architectural features that support time-critical applications are timing compositional cores, independent memory banks inside the compute clusters, and the data NoC whose guaranteed services are determined by network calculus. The programming model provides communicators that effectively support distributed computing primitives such as remote writes, barrier synchronizations, active messages, and communication by sampling. POSIX time functions expose synchronous clocks inside compute clusters and mesosynchronous clocks across the MPPA®-256 processor.
Citations: 174
Energy-efficient scheduling for memory-intensive GPGPU workloads
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.032
Seokwoo Song, Minseok Lee, John Kim, Woong Seo, Yeon-Gon Cho, Soojung Ryu
Abstract: High performance for a GPGPU workload is obtained by maximizing parallelism and fully utilizing the available resources. However, this is not necessarily energy efficient, especially for memory-intensive GPGPU workloads. In this work, we propose Throttle CTA (cooperative-thread array) Scheduling (TCS) where we leverage two types of throttling - throttling the number of active cores and throttling of warp execution in the cores - to improve energy-efficiency for memory-intensive GPGPU workloads. The algorithm requires the global CTA or thread block scheduler to reduce the number of cores with assigned thread blocks while leveraging the local warp scheduler to throttle memory requests for some of the cores to further reduce power consumption. The proposed TCS scheduling does not require off-line analysis but can be done dynamically during execution. Instead of relying on conventional metrics such as miss-per-kilo-instruction (MPKI), we leverage the memory access latency metric to determine the memory intensity of the workloads. Our evaluations show that TCS reduces energy by up to 48% (38% on average) across different memory-intensive workloads while having very little impact on performance for compute-intensive workloads.
Citations: 13
Improving STT-MRAM density through multibit error correction
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE2014.195
Brandon Del Bel, Jongyeon Kim, C. Kim, S. Sapatnekar
Abstract: STT-MRAMs are prone to data corruption due to inadvertent bit flips. Traditional methods enhance robustness at the cost of area/energy by using larger cell sizes to improve the thermal stability of the MTJ cells. This paper employs multibit error correction with DRAM-style refreshing to mitigate errors and provides a methodology for determining the optimal level of correction. A detailed analysis demonstrates that the reduction in nonvolatility requirements afforded by strong error correction translates to significantly lower area for the memory array compared to simpler ECC schemes, even when accounting for the increased overhead of error correction.
Citations: 54
Resolving the memory bottleneck for single supply near-threshold computing
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.215
T. Gemmeke, M. Sabry, J. Stuijt, P. Raghavan, F. Catthoor, David Atienza Alonso
Abstract: This paper focuses on a review of state-of-the-art memory designs and new design methods for near-threshold computing (NTC). In particular, it presents new ways to design reliable low-voltage NTC memories cost-effectively by reusing available cell libraries, or by adding a digital wrapper around existing commercially available memories. The approach is based on modeling at system level supported by silicon measurement on a test chip in a 40 nm low-power processing technology. Advanced monitoring, control and run-time error mitigation schemes enable the operation of these memories at the same optimal near-Vt voltage level as the digital logic. Reliability degradation is thus overcome and this opens the way to solve the memory bottleneck in NTC systems. Starting from the available 40 nm silicon measurements, the analysis is extended to future 14 and 10 nm technology nodes.
Citations: 15
Energy efficient data flow transformation for Givens Rotation based QR Decomposition
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-4 DOI: 10.7873/DATE.2014.224
Namita Sharma, P. Panda, Min Li, Prashant Agrawal, F. Catthoor
Abstract: QR Decomposition (QRD) is a typical matrix decomposition algorithm that shares many common features with other algorithms such as LU and Cholesky decomposition. The principle can be realized in a large number of valid processing sequences that differ significantly in the number of memory accesses and computations, and hence, the overall implementation energy. With modern low power embedded processors evolving towards register files with wide memory interfaces and vector functional units (FUs), the data flow in matrix decomposition algorithms needs to be carefully devised to achieve energy efficient implementation. In this paper, we present an efficient data flow transformation strategy for the Givens Rotation based QRD that optimizes data memory accesses. We also explore different possible implementations for QRD of multiple matrices using the SIMD feature of the processor. With the proposed data flow transformation, a reduction of up to 36% is achieved in the overall energy over conventional QRD sequences.
Citations: 4
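For readers unfamiliar with the technique named in the entry above, the following is a minimal sketch of a textbook Givens-rotation QR decomposition in plain Python. It is not the data flow transformation proposed in the paper; the function names `givens_qr` and `matmul` and the bottom-up, column-by-column elimination order are assumptions chosen for illustration. That order is only one of the many valid processing sequences the abstract refers to.

```python
import math

def matmul(a, b):
    """Multiply two list-of-lists matrices (illustrative helper)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def givens_qr(a):
    """Return (q, r) with a = q * r, q orthogonal and r upper triangular.

    Textbook Givens-rotation QR; one possible elimination order only.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero the sub-diagonal entries of column j from the bottom up.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Apply the rotation G to rows i-1 and i of R.
            for k in range(n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            # Accumulate Q <- Q * G^T on columns i-1 and i of Q.
            for k in range(m):
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r
```

Calling `givens_qr([[6, 5, 0], [5, 1, 4], [0, 4, 3]])` returns a pair whose product reproduces the input up to floating-point rounding. Each rotation reads and writes two full rows, which is why the order in which rotations are applied affects the number of memory accesses.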
Spatial pattern prediction based management of faulty data caches
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.073
G. Keramidas, Michail Mavropoulos, Anna Karvouniari, D. Nikolos
Abstract: Technology scaling leads to significant faulty bit rates in on-chip caches. In this work, we propose a methodology to mitigate the impact of defective bits (due to permanent faults) in first-level set-associative data caches. Our technique assumes that faulty caches are enhanced with the ability of disabling their defective parts at cache subblock granularity. Our experimental findings reveal that while the occurrence of hard errors in faulty caches may have a significant impact on performance, a lot of room for improvement exists if one is able to take into account the spatial reuse patterns of the to-be-referenced blocks (not all the data fetched into the cache is accessed). To this end, we propose frugal PC-indexed spatial predictors (with very small storage requirements) to orchestrate the (re)placement decisions among the fully and partially unusable faulty blocks. Using cycle-accurate simulations, a wide range of scientific applications, and a plethora of cache fault maps, we showcase that our approach is able to offer significant benefits in cache performance.
Citations: 10
Temperature aware energy-reliability trade-offs for mapping of throughput-constrained applications on multimedia MPSoCs
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.115
Anup Das, Akash Kumar, B. Veeravalli
Abstract: This paper proposes a design-time (offline) analysis technique to determine application task mapping and scheduling on a multiprocessor system and the voltage and frequency levels of all cores (offline DVFS) that minimize application computation and communication energy, simultaneously minimizing processor aging. The proposed technique incorporates (1) the effect of the voltage and frequency on the temperature of a core; (2) the effect of neighboring cores' voltage and frequency on the temperature (spatial effect); (3) pipelined execution and cyclic dependencies among tasks; and (4) the communication energy component which often constitutes a significant fraction of the total energy for multimedia applications. The temperature model proposed here can be easily integrated in the design space exploration for multiprocessor systems. Experiments conducted with an MPEG-4 decoder on a real system demonstrate that the temperature using the proposed model is within 5% of the actual temperature, clearly demonstrating its accuracy. Further, the overall optimization technique achieves 40% savings in energy consumption with a 6% increase in system lifetime.
Citations: 44
PUF modeling attacks: An introduction and overview
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.361
U. Rührmair, J. Sölter
Abstract: Machine learning (ML) based modeling attacks are the currently most relevant and effective attack form for so-called Strong Physical Unclonable Functions (Strong PUFs). We provide an overview of this method in this paper: We discuss (i) the basic conditions under which it is applicable; (ii) the ML algorithms that have been used in this context; (iii) the latest and most advanced results; (iv) the right interpretation of existing results; and (v) possible future research directions.
Citations: 91
Advanced system on a chip design based on controllable-polarity FETs
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date: 2014-03-24 Pages: 1-6 DOI: 10.7873/DATE.2014.248
P. Gaillardon, L. Amarù, Jian Zhang, G. Micheli
Abstract: Field-Effect Transistors (FETs) with on-line controllable-polarity are promising candidates to support next generation System-on-Chip (SoC). Thanks to their enhanced functionality, controllable-polarity FETs enable a superior design of critical components in a SoC, such as processing units and memories, while also providing native solutions to control power consumption. In this paper, we present the efficient design of a SoC core with controllable-polarity FETs. Processing units are sped up at the datapath level, as arithmetic operations require fewer physical resources than in standard CMOS. Power consumption is decreased via embedded power-gating techniques and tunable high-performance/low-power device operation. Memory cells are made smaller by merging the access interface with the storage circuitry. We foresee the advantages deriving from these techniques, by evaluating their impact on the design of SoC for a contemporary telecommunication application. Using a 22-nm vertically-stacked silicon nanowire technology, a coarse-grain evaluation at the block level estimates a delay and power reduction of 20% and 19% respectively, at a cost of a moderate area overhead of 15%, with respect to a state-of-the-art FinFET technology.
Citations: 35