16th Symposium on Computer Architecture and High Performance Computing: latest papers

Design space exploration using T&D-Bench
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.16
S. Soares, F. Wagner
{"title":"Design space exploration using T&D-Bench","authors":"S. Soares, F. Wagner","doi":"10.1109/CAHPC.2004.16","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.16","url":null,"abstract":"This paper presents T&D-Bench - teaching and design workbench, a software infrastructure for modeling and simulation of state-of-the-art processors. It combines features that simplify and accelerate the processor design process without restricting the designer possibilities, thus representing a good tradeoff for educational and research purposes that is not found in other environments. In T&D-Bench, a new model is constructed by the designer using script language to define microarchitecture, instruction set, and timing aspects of the processor. These scripts can be produced by a graphical front-end, and a Java simulator targeted at the modeled processor is automatically built from the scripts. This approach can fit well the requirements imposed by the educational environment. Fine-tuning adjustments or the description of more complex processor mechanisms can be achieved by means of modifications in selected parts of the software infrastructure.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127479341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
High performance communication system based on generic programming
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.19
A.L.G. Sanches, F. R. Secco, A. A. Fröhlich
{"title":"High performance communication system based on generic programming","authors":"A.L.G. Sanches, F. R. Secco, A. A. Fröhlich","doi":"10.1109/CAHPC.2004.19","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.19","url":null,"abstract":"This paper presents a high performance communication system based on generic programming. The system adapts itself according to the protocol being used on communication, simplifying the development of libraries. In order to validate the concepts, a MPI implementation has been developed and it is compared to a traditional implementation - MPICH-GM. It is demonstrated that the same functionality and interface can be offered with similar performance, but with much less programming effort. That is evidence that the large size of traditional MPI implementations is due to the limitations of conventional communication systems.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"9 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127501481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Characterizing the dynamic behavior of workload execution in SVM systems
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.12
S. Petit, J. Sahuquillo, A. Pont, D. Kaeli
{"title":"Characterizing the dynamic behavior of workload execution in SVM systems","authors":"S. Petit, J. Sahuquillo, A. Pont, D. Kaeli","doi":"10.1109/CAHPC.2004.12","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.12","url":null,"abstract":"The overhead associated with software management of shared virtual memory (SVM) systems can seriously impact overall system performance. One way to remedy this situation is to design more efficient SVM consistency protocols. In this paper we study a number of parallel workload characteristics that can negatively impact the performance of SVM systems. We attempt to quantify the sources of performance loss in some parallel workloads. Our goal is to better understand these characteristics, enabling us to develop SVM protocols that can adjust to dynamics in workload behavior. This paper has three main contributions: i) we measure the contention for synchronization resources, showing how applications exhibit distinct phases during their execution, ii) we quantify the relationship between page size and fragmentation/false sharing while varying the sharing unit size, and iii) we study the synergies between the contention for synchronization resources and fragmentation/false sharing, providing hints for developing improved protocols.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122176999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Improving server performance on transaction processing workloads by enhanced data placement
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.22
J. Rubio, C. Lefurgy, L. John
{"title":"Improving server performance on transaction processing workloads by enhanced data placement","authors":"J. Rubio, C. Lefurgy, L. John","doi":"10.1109/CAHPC.2004.22","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.22","url":null,"abstract":"Modern servers access large volumes of data while running commercial workloads. The data is typically spread among several storage devices (e.g. disks). Carefully placing the data across the storage devices can minimize costly remote accesses and improve performance. We propose the use of simulated annealing to arrive at an effective layout of data on disk. The proposed technique considers the configuration of the system and the cost of data movement. An initial layout globally optimized across all queries, shows speedups of up to 13% for a group of DSS queries and up to 6% for selected OLTP queries. This technique can be re-applied at run-time to further improve performance beyond the initial, globally optimized data layout. This scheme monitors architecture parameters to prevent optimizations of multiple operations to conflict with each other. Such a dynamic reorganization results in speedups of up to 23% for the DSS queries and up to 10% for the OLTP queries.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130624206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
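The simulated-annealing idea in the abstract above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' method: the cost model (an access is "remote" when the item sits on a device other than the accessing query's home device), the linear cooling schedule, and the names `layout_cost` and `anneal_layout`.

```python
import math
import random


def layout_cost(layout, accesses):
    """Sum the access counts that land on a device other than the home device.

    `accesses` maps (home_device, item) -> access count (hypothetical model).
    """
    return sum(count for (home, item), count in accesses.items()
               if layout[item] != home)


def anneal_layout(items, devices, accesses, steps=2000, t0=5.0, seed=0):
    """Search for a low-cost item-to-device layout with simulated annealing."""
    rng = random.Random(seed)
    layout = {item: rng.choice(devices) for item in items}
    cost = layout_cost(layout, accesses)
    best, best_cost = dict(layout), cost
    for step in range(steps):
        # Linear cooling; the small constant keeps the temperature positive.
        temp = t0 * (1.0 - step / steps) + 1e-9
        item = rng.choice(items)
        old_device = layout[item]
        layout[item] = rng.choice(devices)      # propose moving one item
        new_cost = layout_cost(layout, accesses)
        delta = new_cost - cost
        # Always accept improvements; accept regressions with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(layout), cost
        else:
            layout[item] = old_device           # reject: undo the move
    return best, best_cost
```

Calling `anneal_layout(["t1", "t2"], ["d0", "d1"], {("d0", "t1"): 5, ("d1", "t2"): 3})` returns the best layout seen and its cost; a real system would also have to charge for the data movement itself, which the abstract mentions but this sketch omits.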
A performance evaluation of a quorum-based state-machine replication algorithm for computing grids
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.4
Jean-Michel Busca, M. Bertier, F. Belkouch, Pierre Sens, L. Arantes
{"title":"A performance evaluation of a quorum-based state-machine replication algorithm for computing grids","authors":"Jean-Michel Busca, M. Bertier, F. Belkouch, Pierre Sens, L. Arantes","doi":"10.1109/CAHPC.2004.4","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.4","url":null,"abstract":"Quorum systems are well-known tools that improve the performance and the availability of distributed systems. In this paper we explore their use as a means to achieve low response time for network services that are replicated and accessed over computing grids. To that end, we propose both a quorum construction and a quorum-based state-machine replication algorithm that tolerates crash failures in a partially synchronous model. We show through the evaluation of a real implementation that although simple, this quorum construction and replication algorithm exhibits a response time 20% lower than that of a regular active replication algorithm in appropriate conditions.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114762876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Graph partitioning with the Party library: helpful-sets in practice
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.18
B. Monien, Stefan Schamberger
{"title":"Graph partitioning with the Party library: helpful-sets in practice","authors":"B. Monien, Stefan Schamberger","doi":"10.1109/CAHPC.2004.18","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.18","url":null,"abstract":"Graph partitioning is an important subproblem in many applications. To partition a graph into more than two parts, there exist two different commonly used approaches: Either the graph is partitioned directly into the desired amount of partitions or the graph is first split into two partitions that are then further divided recursively. It has been shown that even optimal recursive bisection can lead to solutions \"very far from the optimal one\". However, for \"important graph classes\" recursive bisection solutions are known to be \"almost always\" within a constant factor of the optimal one. Thus, the question arises how good recursive bisection performs in practice. In this paper we describe enhancements to the Party graph partitioning library which is based on the helpful-set bisection heuristic and present results of extensive tests undertaken with it. We thereby compare Party with the two state-of-the art libraries Metis and Jostle using a permutation based evaluation scheme. We show experimentally that there are indeed many cases where a recursive application of a good bisection heuristic is likely to find better solutions than up-to-date direct approaches.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131574106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
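The recursive-bisection scheme described in the abstract above (split the graph in two, then split each half again) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `bisect` step is a naive breadth-first growth, standing in for (and much weaker than) the helpful-set heuristic used by Party; `k` is assumed to be a power of two; and all function names are hypothetical.

```python
from collections import deque


def bisect(nodes, adj):
    """Naive bisection: grow one half by BFS until it holds half of the nodes."""
    nodes = list(nodes)
    allowed = set(nodes)
    target = len(nodes) // 2
    half, seen, queue = [], set(), deque()
    for start in nodes:                 # restart BFS if a component runs out
        if len(half) >= target:
            break
        if start in seen:
            continue
        seen.add(start)
        queue.append(start)
        while queue and len(half) < target:
            v = queue.popleft()
            half.append(v)
            for w in adj[v]:
                if w in allowed and w not in seen:
                    seen.add(w)
                    queue.append(w)
    first = set(half)
    return ([v for v in nodes if v in first],
            [v for v in nodes if v not in first])


def recursive_bisection(nodes, adj, k):
    """Partition `nodes` into k parts (k assumed to be a power of two)."""
    if k == 1:
        return [list(nodes)]
    left, right = bisect(nodes, adj)
    return (recursive_bisection(left, adj, k // 2)
            + recursive_bisection(right, adj, k // 2))


def edge_cut(parts, adj):
    """Count edges whose endpoints lie in different parts."""
    owner = {v: i for i, part in enumerate(parts) for v in part}
    return sum(1 for v in adj for w in adj[v]
               if v < w and owner[v] != owner[w])
```

On an 8-node path graph with `k=4`, this yields four parts of two consecutive nodes each and an edge cut of 3. A direct k-way method would instead assign all parts in one pass; the abstract's point is that the recursive route is often competitive in practice.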
A parallel engine for graphical interactive molecular dynamics simulations
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.3
E. Rodrigues, A. J. Preto, S. Stephany
{"title":"A parallel engine for graphical interactive molecular dynamics simulations","authors":"E. Rodrigues, A. J. Preto, S. Stephany","doi":"10.1109/CAHPC.2004.3","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.3","url":null,"abstract":"The current work proposes a parallel implementation for interactive molecular dynamics simulations (MD). The interactive capability is modeled by finite automata that are executed in the processing nodes. Any interaction implies in a communication between the user interface and the finite automata. The ADKS, an interactive sequential MD code that provides graphical output was chosen as a case study. A parallel version of this code was developed using the MPI communication library to check its parallel performance without/with visualization. Performance results are discussed for both cases and the influence of visualization in the performance is also treated, including image update rate. In order to allow a modular approach, a new parallel version of the ADKS is being implemented employing the PyMPI Python extension.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117117839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Multi-profile instruction based compression
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/CAHPC.2004.26
E. W. Netto, R. Azevedo, P. Centoducatte, G. Araújo
{"title":"Multi-profile instruction based compression","authors":"E. W. Netto, R. Azevedo, P. Centoducatte, G. Araújo","doi":"10.1109/CAHPC.2004.26","DOIUrl":"https://doi.org/10.1109/CAHPC.2004.26","url":null,"abstract":"Code compression has been used to minimize the memory area requirement of embedded systems. Recently, performance improvement and energy consumption reduction are observed as a by-product of compression. In this paper we propose a novel technique for efficiently exploring the trade-offs involved in code compression. Our multiprofile approach to build dictionaries combines the best features of both static and dynamic program behaviors. The experiments with Mediabench and MiBench suites and the Leon (SPARCv8) processor reveal a compression ratio as low as 71% while performance speed-up reaches 1.5.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121965285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
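To make the dictionary-building idea in the abstract above concrete, here is a rough Python sketch. It is an assumption-laden illustration, not the paper's algorithm: the scoring rule (a weighted mix of static and dynamic instruction frequencies), the 1-bit escape flag, the 8-bit index width, and the names `build_dictionary` and `compression_ratio` are all hypothetical. The ratio is taken as compressed size over original size, so lower is better.

```python
from collections import Counter


def build_dictionary(static_code, dynamic_trace, size, weight=0.5):
    """Pick `size` instructions by a weighted mix of static and dynamic frequency.

    weight=1.0 uses only the static profile (favors code size);
    weight=0.0 uses only the dynamic profile (favors execution speed).
    """
    static = Counter(static_code)
    dynamic = Counter(dynamic_trace)
    s_total = len(static_code) or 1
    d_total = len(dynamic_trace) or 1
    score = {ins: weight * static[ins] / s_total
                  + (1 - weight) * dynamic[ins] / d_total
             for ins in static}
    ranked = sorted(score, key=lambda ins: (-score[ins], ins))
    return ranked[:size]


def compression_ratio(static_code, dictionary, word_bits=32, index_bits=8):
    """Compressed size divided by original size, dictionary table included."""
    in_dict = set(dictionary)
    # Each instruction carries a 1-bit flag: dictionary index or raw word.
    compressed = sum(1 + (index_bits if ins in in_dict else word_bits)
                     for ins in static_code)
    table = len(dictionary) * word_bits
    return (compressed + table) / (len(static_code) * word_bits)
```

Varying `weight` shows the trade-off the abstract refers to: a dictionary tuned to the static profile shrinks the binary most, while one tuned to the dynamic profile compresses the instructions that execute most often.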
On the combined scheduling of malleable and rigid jobs
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/SBAC-PAD.2004.27
J. Hungershofer
{"title":"On the combined scheduling of malleable and rigid jobs","authors":"J. Hungershofer","doi":"10.1109/SBAC-PAD.2004.27","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2004.27","url":null,"abstract":"The demand of the users of parallel systems for low response times contradicts the ambition of the system maintainers for a high utilization. A high utilization normally results in long waiting times for the users' jobs. To fulfill the concerns of both interest groups is a hard job to do. The usage of more flexible job models can be a way out of the dilemma. These models allow jobs to change their width at application start (moldable jobs) or even during execution (malleable jobs). We have analyzed the quality of schedules using job sets with moldable and malleable jobs and combinations of both. Tracefiles from supercomputer installations have been modified to contain varying fractions of moldable and malleable jobs. Using a special simulation environment for the more flexible job models the jobs have been scheduled virtually. The results show that both interest groups mentioned above can be pleased if these job models are used and the average response times become significantly better.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129890300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
A cluster-based strategy for scheduling task on heterogeneous processors
16th Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-10-27 DOI: 10.1109/SBAC-PAD.2004.1
Cristina Boeres, J. V. Filho, Vinod E. F. Rebello
{"title":"A cluster-based strategy for scheduling task on heterogeneous processors","authors":"Cristina Boeres, J. V. Filho, Vinod E. F. Rebello","doi":"10.1109/SBAC-PAD.2004.1","DOIUrl":"https://doi.org/10.1109/SBAC-PAD.2004.1","url":null,"abstract":"Efficient task scheduling is fundamental for parallel applications to achieve good performance on distributed systems. While extensive work exists for scheduling tasks on homogeneous processors, fewer algorithms exist for the more common problem of scheduling in heterogeneous processor environments. In this paper, we propose coupling a replication-based clustering heuristic for homogeneous processors, with a mechanism to map the generated clusters to the heterogeneous environment. Experimental results show that this strategy compares favourably in terms of the makespan with traditional list scheduling approaches to this problem, particularly when communication costs are high.","PeriodicalId":375288,"journal":{"name":"16th Symposium on Computer Architecture and High Performance Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130063795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 87