19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07): Latest Papers

Predicting Loop Termination to Boost Speculative Thread-Level Parallelism in Embedded Applications
Md. Mafijul Islam
{"title":"Predicting Loop Termination to Boost Speculative Thread-Level Parallelism in Embedded Applications","authors":"Md. Mafijul Islam","doi":"10.1109/SBAC-PAD.2007.23","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.23","url":null,"abstract":"The necessity of devising novel thread-level speculation (TLS) techniques has become extremely important with the growing acceptance of multi-core architectures by the industry. However, the achievable performance is limited by the thread-management overhead and falls short of the actual potential of TLS. In this paper, we have exploited the run-time behavior of the performance-critical loops to minimize such overhead and thereby improve performance in embedded applications. We have shown that an average speedup of 2.4 is achievable on a 4-way machine which supports TLS, but has no special mechanism to predict the loop trip count. Then we have augmented the machine with the perfect knowledge of the loop trip count and obtained an average speedup of 2.6. Finally, we have incorporated a simple stride predictor to predict the loop trip count dynamically. The proposed predictor has an average prediction accuracy of 96% and the machine then yields an average speedup of 2.5 for the chosen applications.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121402694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
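The stride predictor mentioned in the abstract above can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the class name StrideTripCountPredictor, the single-entry state (last trip count plus last stride), and the accuracy helper are hypothetical and are not the predictor evaluated in the paper.

```python
class StrideTripCountPredictor:
    """Hypothetical stride predictor: next trip count = last count + last stride."""

    def __init__(self):
        self.last = None   # trip count observed on the previous loop execution
        self.stride = 0    # difference between the two most recent trip counts

    def predict(self):
        # No prediction is possible before the loop has executed once.
        if self.last is None:
            return None
        return self.last + self.stride

    def update(self, actual):
        # Record the real trip count once the loop has terminated.
        if self.last is not None:
            self.stride = actual - self.last
        self.last = actual


def accuracy(trip_counts):
    """Fraction of exact predictions over a sequence of observed trip counts."""
    predictor = StrideTripCountPredictor()
    hits = total = 0
    for actual in trip_counts:
        guess = predictor.predict()
        if guess is not None:
            total += 1
            hits += guess == actual
        predictor.update(actual)
    return hits / total if total else 0.0
```

In this sketch a loop with a constant trip count is predicted exactly from its second execution onward, while a loop whose trip count grows by a fixed amount costs one extra misprediction while the stride is learned.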
Node Level Primitives for Parallel Exact Inference
Yinglong Xia, V. Prasanna
{"title":"Node Level Primitives for Parallel Exact Inference","authors":"Yinglong Xia, V. Prasanna","doi":"10.1109/SBAC-PAD.2007.18","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.18","url":null,"abstract":"We present node level primitives for parallel exact inference on an arbitrary Bayesian network. We explore the probability representation on each node of Bayesian networks and each clique of junction trees. We study the operations with respect to these probability representations and categorize the operations into four node level primitives: table extension, table multiplication, table division, and table marginalization. Exact inference on Bayesian networks can be implemented based on these node level primitives. We develop parallel algorithms for the above and achieve parallel computational complexity of $O(w^2 r^{w+1} N/p)$, $O(N r^w)$ space complexity and scalability up to $O(r^w)$, where N is the number of cliques in the junction tree, r is the number of states of a random variable, w is the maximal size of the cliques, and p is the number of processors. Experimental results illustrate the scalability of our parallel algorithms for each of these primitives.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115893868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
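Two of the four primitives named in the abstract above, table multiplication and table marginalization, can be sketched sequentially on small probability tables. This is only an illustration and not the authors' parallel algorithms: the dictionary-based table layout, the function names multiply and marginalize, and the example network are assumptions. Table extension is folded into multiply here, and table division is omitted.

```python
from itertools import product


def marginalize(variables, table, keep):
    """Sum out every variable of `table` that is not listed in `keep`."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for assignment, value in table.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + value
    return tuple(keep), out


def multiply(vars_a, table_a, vars_b, table_b, r):
    """Entry-wise product over the union of variables, each with r states."""
    union = tuple(vars_a) + tuple(v for v in vars_b if v not in vars_a)
    ia = [union.index(v) for v in vars_a]
    ib = [union.index(v) for v in vars_b]
    out = {}
    for assignment in product(range(r), repeat=len(union)):
        a = tuple(assignment[i] for i in ia)  # projection onto table_a's variables
        b = tuple(assignment[i] for i in ib)  # projection onto table_b's variables
        out[assignment] = table_a[a] * table_b[b]
    return union, out


# Hypothetical two-node network A -> B with r = 2 states per variable.
p_a = {(0,): 0.6, (1,): 0.4}
p_b_given_a = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}

joint_vars, joint = multiply(("A",), p_a, ("A", "B"), p_b_given_a, r=2)
_, p_b = marginalize(joint_vars, joint, ("B",))  # P(B=0) is about 0.62
```

The product table has r^k entries for k variables, which is where the r^w factor in the stated complexity bounds comes from when k reaches the maximal clique size w.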
An Intelligent Mechanism to Explore a Two-Level Cache Hierarchy Considering Energy Consumption and Time Performance
A. Silva-Filho, Carmelo J. A. Bastos Filho, Ricardo Massa Ferreira Lima, D. Falcão, F. Cordeiro, Marília P. Lima
{"title":"An Intelligent Mechanism to Explore a Two-Level Cache Hierarchy Considering Energy Consumption and Time Performance","authors":"A. Silva-Filho, Carmelo J. A. Bastos Filho, Ricardo Massa Ferreira Lima, D. Falcão, F. Cordeiro, Marília P. Lima","doi":"10.1109/SBAC-PAD.2007.14","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.14","url":null,"abstract":"Cache memory hierarchy contributes positively to system performance. Moreover, tuning cache architectures in platforms for embedded applications can dramatically reduce energy consumption. This paper presents an automated method for adjusting a two-level cache memory hierarchy intended for data caches in order to reduce energy consumption and improve the performance of embedded applications. We propose an automated mechanism called TEMGA (Two-level cache Exploration Mechanism based on Genetic Algorithm) to determine the suitable cache hierarchy configuration by exploring a small part of the search space. In our experiments, we applied the proposed mechanism to 12 different benchmarks from the MiBench suite. The results show an average reduction of about 15% in the energy consumption for data caches when compared to existing heuristics and a fivefold reduction in the number of cycles needed to execute applications from the MiBench benchmark suite.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123247360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Automatic Constraint Partitioning to Speed Up CLP Execution
M. Pereira, P. Vargas, M. D. Castro, F. França, I. Dutra
{"title":"Automatic Constraint Partitioning to Speed Up CLP Execution","authors":"M. Pereira, P. Vargas, M. D. Castro, F. França, I. Dutra","doi":"10.1109/SBAC-PAD.2007.29","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.29","url":null,"abstract":"Speedup in distributed executions of constraint logic programming (CLP) applications is directly related to a good constraint partitioning algorithm. In this work we study different mechanisms to distribute constraints to processors based on straightforward mechanisms such as round-robin and block distribution, and on a more sophisticated automatic distribution method, grouping-sink, that takes into account the connectivity of the constraint network graph. This aims at reducing the communication overhead in distributed environments. Our results show that grouping-sink is, in general, the best alternative for partitioning constraints as it produces results as good as or better than round-robin or blocks with low communication rate.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114864566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
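The two baseline schemes named in the abstract above, round-robin and block distribution, can be compared with a short sketch. The function names, the chain-shaped constraint graph, and the use of cut edges as a stand-in for communication are assumptions made for illustration. The grouping-sink method is not shown, because the abstract does not describe how it works.

```python
def round_robin(constraints, p):
    """Deal constraints to p processors one at a time, like cards."""
    return [constraints[i::p] for i in range(p)]


def block(constraints, p):
    """Give each processor one contiguous block; earlier blocks absorb the remainder."""
    size, extra = divmod(len(constraints), p)
    parts, start = [], 0
    for i in range(p):
        end = start + size + (1 if i < extra else 0)
        parts.append(constraints[start:end])
        start = end
    return parts


def cut_edges(parts, edges):
    """Count constraint-graph edges whose endpoints sit on different processors."""
    owner = {c: k for k, part in enumerate(parts) for c in part}
    return sum(1 for u, v in edges if owner[u] != owner[v])


# Hypothetical constraint network: 7 constraints connected in a chain.
constraints = list(range(7))
edges = [(i, i + 1) for i in range(6)]

rr_cut = cut_edges(round_robin(constraints, 3), edges)
block_cut = cut_edges(block(constraints, 3), edges)
```

On this chain, round-robin separates every pair of neighbouring constraints while block distribution separates only the pairs at block boundaries, which shows why a partitioner that takes graph connectivity into account can lower communication.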
On-line Scheduling of MPI-2 Programs with Hierarchical Work Stealing
G. P. Pezzi, M. C. Cera, E. Mathias, N. Maillard, P. Navaux
{"title":"On-line Scheduling of MPI-2 Programs with Hierarchical Work Stealing","authors":"G. P. Pezzi, M. C. Cera, E. Mathias, N. Maillard, P. Navaux","doi":"10.1109/SBAC-PAD.2007.36","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.36","url":null,"abstract":"MPI (Message Passing Interface) is the de facto standard in High Performance Computing. By using some new MPI-2 features, such as the dynamic creation of processes, it is possible to implement highly efficient parallel programs that can run on dynamic and/or heterogeneous resources, provided a good schedule of the processes can be computed at run-time. A classical solution to schedule parallel programs on-line is Work Stealing. However, its use with MPI-2 is complicated by a restricted communication scheme between the processes: namely, spawned processes in MPI-2 can only communicate with their direct parents. This work presents an on-line scheduling algorithm, called Hierarchical Work Stealing, to obtain good load-balancing of MPI-2 programs that follow a Divide & Conquer strategy. Experimental results are provided, based on a synthetic application, the N-Queens computation. The results show that the Hierarchical Work Stealing algorithm enables the use of MPI with high efficiency, even in parallel dynamic HPC platforms that are not as homogeneous as clusters.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114335776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Self-Imposed Temporal Redundancy: An Efficient Technique to Enhance the Reliability of Pipelined Functional Units
E. Mizan, Tileli Amimeur, M. Jacome
{"title":"Self-Imposed Temporal Redundancy: An Efficient Technique to Enhance the Reliability of Pipelined Functional Units","authors":"E. Mizan, Tileli Amimeur, M. Jacome","doi":"10.1109/SBAC-PAD.2007.39","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.39","url":null,"abstract":"Temporal redundancy (TR) improves the reliability of computational functional units (FUs). However, it can guarantee detection of transient errors only, and may have a substantial power and area overhead. In this paper we present self-imposed temporal redundancy (SITR), a form of TR that can be applied to pipelined FUs and does not suffer from the aforementioned problems. A SITR-enhanced FU forces redundant computations to fire in consecutive cycles and requires a single additional cycle for the second computation and the comparison of the two results. We evaluate the power and area overhead of SITR and conclude that it is always smaller than that of standard TR and that it does not depend on the FU complexity. We also use SITR to improve the reliability of the execution datapath of a simple out-of-order engine, typical of that used in high reliability embedded systems and future many-core architectures. Our simulations show that SITR outperforms TR, especially in FP applications. When the number of integer ALUs is larger than the machine width, the performance penalty of SITR is consistently less than 10%.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125155466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Exigency-based real-time scheduling policy to provide absolute QoS for web services
Lucas S. Casagrande, Rodrigo Fernandes de Mello, Ricardo Bertagna, J. A. A. Filho, Francisco José Monaco
{"title":"Exigency-based real-time scheduling policy to provide absolute QoS for web services","authors":"Lucas S. Casagrande, Rodrigo Fernandes de Mello, Ricardo Bertagna, J. A. A. Filho, Francisco José Monaco","doi":"10.1109/SBAC-PAD.2007.21","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.21","url":null,"abstract":"Telemedicine, distance learning and e-commerce applications impose time constraints directly related to the efficacy of their operations. In order to offer reliability levels capable of meeting such requirements, mechanisms to provide QoS have been widely employed, which motivates this work to propose, implement and validate a real-time scheduling policy for providing absolute QoS for web services. The policy, named Exigency-Based Scheduling (EBS), aims to serve the most urgent requests quickly, without degrading the service of the system as a whole. The approach is based on real-time scheduling, low latency and feedback scheduling, allowing a balanced configuration by the quantification of the exigency imposed on the system by the service classes. The technique is evaluated using metrics proposed in the present work. Experimental results confirm improvements in terms of QoS and client satisfaction.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126957538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
A Component-Oriented Support for Hierarchical MPI Programming on Multi-Cluster Grid Environments
E. Mathias, F. Baude, Vincent Cavé, N. Maillard
{"title":"A Component-Oriented Support for Hierarchical MPI Programming on Multi-Cluster Grid Environments","authors":"E. Mathias, F. Baude, Vincent Cavé, N. Maillard","doi":"10.1109/SBAC-PAD.2007.37","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.37","url":null,"abstract":"In this paper, we present a proposal for hierarchical MPI programming through some intuitive extensions to the MPI standard that may help users to develop non-embarrassingly parallel grid applications in a topology-aware manner. Afterwards, we present the design of such support based upon a component model suited to grid computing (the EU CoreGRID grid component model - GCM - and its implementation in the ProActive grid environment) to handle inter-cluster and group communications. The usage of such components to handle high-level data distribution, parallelism and synchronization seems to be the most adequate technology to support MPI primitives in multi-cluster grids as they provide built-in support for the encapsulation of native code, collective interfaces, tunneling of communications and a hierarchical and adaptable structure. The preliminary results have shown that the overhead is not negligible, but within the expected range. However, we can expect the benefits to applications to outweigh the generated overhead.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"36 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131550211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Novel Algorithm for Indirect Reputation-Based Grid Resource Management
Javier Echaiz, Jorge Ardenghi, Guillermo R. Simari
{"title":"A Novel Algorithm for Indirect Reputation-Based Grid Resource Management","authors":"Javier Echaiz, Jorge Ardenghi, Guillermo R. Simari","doi":"10.1109/SBAC-PAD.2007.24","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2007.24","url":null,"abstract":"A computational grid is a distributed infrastructure that appears to the end user as one large computing resource across organization boundaries. Grid technologies enable large-scale sharing of resources within formal or informal consortia of individuals and/or institutions, usually called virtual organizations. In these settings, the discovery, characterization, management, and monitoring of resources, services, and computations can be challenging due to the considerable diversity, large numbers, dynamic behavior, and geographical distribution of the entities in which a user might be interested. Trust is one of the biggest concerns in the grid resource management field. Grid systems can employ reputation mechanisms in order to provide this essential trust, but not usually without incurring certain additional costs that negate the potential performance gains offered by grid computing technologies. Moreover, current reputation mechanisms are not appropriate for resource management in large-scale systems. In this paper, we present a new reputation model for resource management based on an economy model. We also demonstrate how it can be employed to add trust to algorithms for grid scheduling. Finally, we simulate the proposed resource management algorithm in order to verify its effectiveness.","PeriodicalId":261956,"journal":{"name":"19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126019641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6