Proceedings of the 7th ACM international conference on Computing frontiers: latest articles

Novel low-cost aging sensor
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787299
M. Omaña, Daniele Rossi, N. Bosio, C. Metra
{"title":"Novel low-cost aging sensor","authors":"M. Omaña, Daniele Rossi, N. Bosio, C. Metra","doi":"10.1145/1787275.1787299","DOIUrl":"https://doi.org/10.1145/1787275.1787299","url":null,"abstract":"Performance degradation of integrated circuits due to aging effects, such as Negative Bias Temperature Instability (NBTI), is becoming of great concern for current and future CMOS technology. Here we introduce an aging sensor able to detect such degradations in the combinational part of a critical data-path. It requires lower area than recently proposed alternative solutions, and a lower or comparable power consumption.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131636147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
A heterogeneous parallel system running open mpi on a broadband network of embedded set-top devices
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787324
Richard Neill, Alexander Shabarshin, L. Carloni
{"title":"A heterogeneous parallel system running open mpi on a broadband network of embedded set-top devices","authors":"Richard Neill, Alexander Shabarshin, L. Carloni","doi":"10.1145/1787275.1787324","DOIUrl":"https://doi.org/10.1145/1787275.1787324","url":null,"abstract":"We present a heterogeneous parallel computing system that combines a traditional computer cluster with a broadband network of embedded set-top box (STB) devices. As multiple service operators (MSO) manage millions of these devices across wide geographic areas, the computational power of such a massively-distributed embedded system could be harnessed to realize a centrally-managed, energy-efficient parallel processing platform that supports a variety of application domains which are of interest to MSOs, consumers, and the high-performance computing research community. We investigate the feasibility of this idea by building a prototype system that includes a complete head-end cable system with a DOCSIS-2.0 network combined with an interoperable implementation of a subset of Open MPI running on the STB embedded operating system. We evaluate the performance and scalability of our system compared to a traditional cluster by solving approximately various instances of the Multiple Sequence Alignment bioinformatics problem, while the STBs continue simultaneously to operate their primary functions: decode MPEG streams for television display and run an interactive user interface. 
Based on our experimental results and given the technology trends in embedded computing we argue that our approach to leverage a broadband network of embedded devices in a heterogeneous distributed system offers the benefits of both parallel computing clusters and distributed Internet computing.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114398654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Combining deblurring and denoising for handheld HDR imaging in low light conditions
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787303
P. Lakshman
{"title":"Combining deblurring and denoising for handheld HDR imaging in low light conditions","authors":"P. Lakshman","doi":"10.1145/1787275.1787303","DOIUrl":"https://doi.org/10.1145/1787275.1787303","url":null,"abstract":"This paper proposes a probability formulation that unifies both single-image deblurring and multi-image denoising using variational inference. Based on this formulation, a new algorithm for deblurring a noisy and blurry image pair is presented. Besides, we provide also an approach that combines existing optical flow and image denoising techniques for High Dynamic Range imaging.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114883821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Session details: Caches and branches 2
C. Trinitis
{"title":"Session details: Caches and branches 2","authors":"C. Trinitis","doi":"10.1145/3251913","DOIUrl":"https://doi.org/10.1145/3251913","url":null,"abstract":"","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122880814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: Keynote
G. Bilardi
{"title":"Session details: Keynote","authors":"G. Bilardi","doi":"10.1145/3251918","DOIUrl":"https://doi.org/10.1145/3251918","url":null,"abstract":"","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"14 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116859043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On-chip communication and synchronization mechanisms with cache-integrated network interfaces
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787328
S. Kavadias, M. Katevenis, M. Zampetakis, Dimitrios S. Nikolopoulos
{"title":"On-chip communication and synchronization mechanisms with cache-integrated network interfaces","authors":"S. Kavadias, M. Katevenis, M. Zampetakis, Dimitrios S. Nikolopoulos","doi":"10.1145/1787275.1787328","DOIUrl":"https://doi.org/10.1145/1787275.1787328","url":null,"abstract":"Per-core local (scratchpad) memories allow direct inter-core communication, with latency and energy advantages over coherent cache-based communication, especially as CMP architectures become more distributed. We have designed cache-integrated network interfaces (NIs), appropriate for scalable multicores, that combine the best of two worlds, the flexibility of caches and the efficiency of scratchpad memories: on-chip SRAM is configurably shared among caching, scratchpad, and virtualized NI functions. This paper presents our architecture, which provides local and remote scratchpad access, to either individual words or multi-word blocks through RDMA copy. Furthermore, we introduce event responses, as a mechanism for software configurable synchronization primitives. We present three event response mechanisms that expose NI functionality to software, for multiword transfer initiation, memory barriers for explicitly-selected accesses of arbitrary size, and multi-party synchronization queues. We implemented these mechanisms in a four-core FPGA prototype, and evaluated the on-chip communication performance on the prototype as well as on a CMP simulator with up to 128 cores. 
We demonstrate efficient synchronization, low-overhead communication, and amortized-overhead bulk transfers, which allow parallelization gains for fine-grain tasks, and efficient exploitation of the hardware bandwidth.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"200 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134041697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Interval-based models for run-time DVFS orchestration in superscalar processors
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787338
G. Keramidas, Vasileios Spiliopoulos, S. Kaxiras
{"title":"Interval-based models for run-time DVFS orchestration in superscalar processors","authors":"G. Keramidas, Vasileios Spiliopoulos, S. Kaxiras","doi":"10.1145/1787275.1787338","DOIUrl":"https://doi.org/10.1145/1787275.1787338","url":null,"abstract":"We develop two simple interval-based models for dynamic superscalar processors. These models allow us to: i) predict with great accuracy performance and power consumption under various frequency and voltage combinations and ii) implement targeted DVFS policies at run-time. The models analyze program execution in intervals - steady-state and miss-event intervals. Intervals are signalled by miss events (L2-misses in our case) that upset the \"steady state\" execution of the program and are ended when the pipeline reaches again a steady state. The first model is fed by an approximation of the stall cycles (the time the processor instruction window is blocked) due to long-latency L2-misses. The second model improves on this approximation using as input the occupancy of the L2's miss-handling registers (MSHRs). Despite their simplicity these models prove to be accurate in predicting the performance (and energy) for any target frequency/voltage setting, yielding average errors of 2.1% and 0.2% respectively. Besides modelling, we show that the methodology we propose is powerful enough to implement (at run-time) various DVFS policies: \"operate at optimal EDP\" or \"ED2P,\" or even \"reduce ED2P within specific performance constraints.\" Approaches based on the two models require minimal hardware cost: two counters for measuring the duration of the steady state and the miss-event intervals and some comparison logic. To validate our methodology we use a cycle-accurate simulator and the benchmarks provided by the SPEC2K suite. 
Our results indicate that our proposed run-time mechanism is able to orchestrate different DVFS policies with great success yielding negligible errors - below 1.5% on average.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115444214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 93
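The DVFS abstract above names the policies "operate at optimal EDP" and "ED2P" without spelling them out. The sketch below only illustrates those two standard metrics (energy times delay, and energy times delay squared) on top of a deliberately simplified two-interval time model: steady-state cycles scale with clock frequency, miss-event time does not. It is not the authors' model. The class `OperatingPoint`, the function names, and the constants `c_eff` and `p_static_w` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One frequency/voltage setting (hypothetical values, not from the paper)."""
    freq_ghz: float  # core clock frequency in GHz
    voltage: float   # supply voltage in volts


def predict_time(core_cycles, miss_time_s, point):
    """Steady-state cycles speed up with frequency; miss-event time is fixed."""
    return core_cycles / (point.freq_ghz * 1e9) + miss_time_s


def predict_energy(core_cycles, miss_time_s, point, c_eff=1e-9, p_static_w=0.1):
    """Dynamic energy ~ C * V^2 per cycle, plus static power over the run time.

    c_eff and p_static_w are assumed illustrative constants.
    """
    dynamic = c_eff * point.voltage ** 2 * core_cycles
    static = p_static_w * predict_time(core_cycles, miss_time_s, point)
    return dynamic + static


def best_point(core_cycles, miss_time_s, points, delay_exponent=1):
    """Pick the setting minimising E * T^n: n=1 is EDP, n=2 is ED2P."""
    def metric(point):
        t = predict_time(core_cycles, miss_time_s, point)
        e = predict_energy(core_cycles, miss_time_s, point)
        return e * t ** delay_exponent
    return min(points, key=metric)
```

Under these assumed constants, ED2P weights delay more heavily than EDP, so it can favour a higher frequency for the same workload. A run-time policy in this style would re-evaluate `best_point` as the measured split between steady-state cycles and miss time changes.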
Hybrid parallel programming with MPI and unified parallel C
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787323
James Dinan, P. Balaji, E. Lusk, P. Sadayappan, R. Thakur
{"title":"Hybrid parallel programming with MPI and unified parallel C","authors":"James Dinan, P. Balaji, E. Lusk, P. Sadayappan, R. Thakur","doi":"10.1145/1787275.1787323","DOIUrl":"https://doi.org/10.1145/1787275.1787323","url":null,"abstract":"The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity because of their ability to provide a shared global address space that spans the memories of multiple compute nodes. However, taking advantage of UPC can require a large recoding effort for existing parallel applications. In this paper, we explore a new hybrid parallel programming model that combines MPI and UPC. This model allows MPI programmers incremental access to a greater amount of memory, enabling memory-constrained MPI codes to process larger data sets. In addition, the hybrid model offers UPC programmers an opportunity to create static UPC groups that are connected over MPI. As we demonstrate, the use of such groups can significantly improve the scalability of locality-constrained UPC codes. This paper presents a detailed description of the hybrid model and demonstrates its effectiveness in two applications: a random access benchmark and the Barnes-Hut cosmological simulation. 
Experimental results indicate that the hybrid model can greatly enhance performance; using hybrid UPC groups that span two cluster nodes, RA performance increases by a factor of 1.33 and using groups that span four cluster nodes, Barnes-Hut experiences a twofold speedup at the expense of a 2% increase in code size.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117074908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 56
Global management of cache hierarchies
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787315
M. Zahran, S. Mckee
{"title":"Global management of cache hierarchies","authors":"M. Zahran, S. Mckee","doi":"10.1145/1787275.1787315","DOIUrl":"https://doi.org/10.1145/1787275.1787315","url":null,"abstract":"Cache memories currently treat all blocks as if they were equally important. This assumption of equally important blocks is not always valid. For instance, not all blocks deserve to be in L1 cache. We therefore propose globalized block placement. We present a global placement algorithm for managing blocks in a cache hierarchy by deciding where in the hierarchy an incoming block should be placed. Our technique makes decisions by adapting to access patterns of different blocks. The contributions of this paper are fourfold. First, we motivate our solution by demonstrating the importance of a globalized placement scheme. Second, we present a method to categorize cache block behavior into one of four categories. Third, we present one potential design exploiting this categorization. Finally, we demonstrate the performance of our design. The proposed scheme enhances overall system performance (IPC) by an average of 12% over a traditional LRU scheme while reducing traffic between L1 cache and L2 cache by an average of 20%, using SPEC CPU benchmark suite. All of this is achieved with a table as small as 3 KB.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116920514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
SpiNNaker: impact of traffic locality, causality and burstiness on the performance of the interconnection network
Proceedings of the 7th ACM international conference on Computing frontiers Pub Date : 2010-05-17 DOI: 10.1145/1787275.1787278
J. Navaridas, L. Plana, J. Miguel-Alonso, M. Luján, S. Furber
{"title":"SpiNNaker: impact of traffic locality, causality and burstiness on the performance of the interconnection network","authors":"J. Navaridas, L. Plana, J. Miguel-Alonso, M. Luján, S. Furber","doi":"10.1145/1787275.1787278","DOIUrl":"https://doi.org/10.1145/1787275.1787278","url":null,"abstract":"The SpiNNaker system is a biologically-inspired massively parallel architecture of bespoke multi-core System-on-Chips. The aim of its design is to simulate up to a billion spiking neurons in (biological) real-time. Packets, in SpiNNaker, represent neural spikes and these travel through the two-dimensional triangular torus network that connects the over 65 thousand nodes housed in the largest size of SpiNNaker. The research question that we explore is the impact that spatial locality, temporal causality and burstiness of the traffic have on the performance of such interconnection network. Given the limited knowledge of neuron activity patterns, we propose and use synthetic traffic patterns which resemble biological neural traffic and allow tuning of spatial locality. Causality is explored by means of temporal patterns that maintain a specified overall network load while allowing at the node level autonomous causal traffic generation. Part of the traffic is generated automatically, but the remaining traffic is triggered by a spike arrival in the form of a packet or a burst of packets; as neural stimuli do. In this way, we generate non-uniform traffic patterns with an evolving concentration of activity at nodes which contain more active parts of the spiking neural network. Given the application domain, the simulation-based study focuses on the real-time behavior of the system rather than focusing on standard HPC network metrics. The results show that the interconnection network of SpiNNaker can operate without dropping packets with traffic loads that exceed more than 3.5 times those required to simulate 10^9 spiking neurons, despite using non-local traffic. 
We also find that increments in the degree of traffic causality do not affect the performance of the system, but burstiness in the traffic can hurt performance.","PeriodicalId":151791,"journal":{"name":"Proceedings of the 7th ACM international conference on Computing frontiers","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129961487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15