2008 37th International Conference on Parallel Processing: Latest Papers, Page 4

An Efficient Parallel Algorithm for the Multiple Longest Common Subsequence (MLCS) Problem

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.79

Dmitry Korkin, Qingguo Wang, Yi Shang

Abstract: Finding the multiple longest common subsequence (MLCS) is an important problem in bioinformatics and computational genomics. Approaches that are more efficient than the standard dynamic programming method have been introduced and successfully parallelized for the special case of two sequences. However, the increasing complexity and size of biological data require an efficient method applicable to an arbitrary number of sequences, as well as its efficient parallelization. A recently developed dominant points method for the general MLCS problem has shown a significant performance improvement over the dynamic programming method when the number of sequences is larger than two. At the same time, the approach has revealed a strong demand for parallelization, so that it can be applied to larger families of sequences or to longer sequences. In this paper, we introduce an efficient parallel algorithm, based on the dominant points method, to find an MLCS for an arbitrary number of sequences. When the number of processors is not greater than the size of the alphabet multiplied by the number of sequences, the parallel algorithm is estimated to have asymptotically linear speedup. We experimentally tested the algorithm using sets of randomly generated sequences over different alphabets as well as protein sequences from a family of homologous proteins. We found that the performance of the algorithm increases with the number of input sequences and reaches a near-linear speedup for eight sequences.
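For orientation, the standard dynamic programming baseline that the abstract compares against can be sketched as follows. This is a minimal illustration for small inputs, not the dominant points method and not the authors' parallel algorithm; the function name `mlcs_length` is hypothetical. The number of subproblems grows with the product of the sequence lengths, which is why the baseline becomes impractical as more sequences are added.

```python
from functools import lru_cache


def mlcs_length(seqs):
    """Length of a longest common subsequence of all strings in seqs.

    Baseline dynamic programming over prefix-length tuples (an
    illustrative sketch, not the dominant points method).
    """
    seqs = tuple(seqs)
    if not seqs:
        return 0

    @lru_cache(maxsize=None)
    def solve(idx):
        # idx[k] is the length of the prefix of seqs[k] under consideration.
        if any(i == 0 for i in idx):
            return 0
        last_chars = {s[i - 1] for s, i in zip(seqs, idx)}
        if len(last_chars) == 1:
            # All prefixes end in the same symbol: match it everywhere.
            return solve(tuple(i - 1 for i in idx)) + 1
        # Otherwise shorten one prefix at a time and keep the best result.
        return max(
            solve(idx[:k] + (idx[k] - 1,) + idx[k + 1:])
            for k in range(len(idx))
        )

    return solve(tuple(len(s) for s in seqs))
```

For example, `mlcs_length(["AGGTAB", "GXTXAYB"])` returns 4 (the common subsequence is "GTAB"). The recursion is only suitable for short sequences, since Python's default recursion limit is reached quickly.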

Citations: 26

Designing an Efficient Kernel-Level and User-Level Hybrid Approach for MPI Intra-Node Communication on Multi-Core Systems

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.16

Lei Chai, P. Lai, Hyun-Wook Jin, D. Panda

Citations: 33

Scalable Techniques for Transparent Privatization in Software Transactional Memory

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.69

Virendra J. Marathe, Michael F. Spear, M. Scott

Citations: 64

A Scalable Architecture for Crowd Simulation: Implementing a Parallel Action Server

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.20

G. Vigueras, M. Lozano, C. Perez, J. Orduña

Abstract: Crowd simulation can be considered a special case of virtual environments in which avatars are intelligent agents instead of user-driven entities. These applications require both rendering visually plausible images of the virtual world and managing the behavior of autonomous agents. Although several proposals have focused on software architectures for these systems, the scalability of crowd simulation is still an open issue. In this paper, we propose a scalable architecture that can manage large crowds of autonomous agents at interactive rates. The proposal enhances a previously proposed architecture through the efficient parallelization of the action server and the distribution of the semantic database. In this way, the system bottleneck is removed, and new action servers (each hosted on a new computer) can be added as necessary. The evaluation results show that the proposed architecture is able to fully exploit the underlying hardware platform, regardless of both the number and the kind of computers that form the system. Therefore, this system architecture provides the scalability required for large-scale crowd simulation.

Citations: 24

Cellular ANTomata: Food-Finding and Maze-Threading

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.13

A. Rosenberg

Citations: 12

MFLUSH: Handling Long-Latency Loads in SMT On-Chip Multiprocessors

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.48

Carmelo Acosta, F. Cazorla, Alex Ramírez, M. Valero

Abstract: There is a clear trend in industry towards using the growing number of on-chip transistors to replicate execution cores (CMP), where each core supports simultaneous multithreading (SMT). State-of-the-art high-performance processors like the IBM POWER5 and POWER6 corroborate this CMP+SMT trend. Within each SMT core, any of the well-known SMT mechanisms may be applied to face SMT-related challenges. Among them, probably the most important issue in an SMT execution pipeline concerns the instruction fetch (IFetch) Policy. The FLUSH IFetch Policy represents a choice for throughput-oriented scenarios. It handles L2 cache misses in order to avoid hardware resource monopolization by any given execution thread, at an additional energy cost due to instruction refetching. However, the new constraints imposed by the CMP+SMT scenario may affect well-known SMT mechanisms, like the FLUSH mechanism. In this paper we revisit the FLUSH mechanism and analyze its application in the emerging CMP+SMT scenario. The analysis points out the new difficulties that the FLUSH mechanism faces in this scenario. We then propose a novel IFetch Policy designed to cope with the CMP+SMT scenario: MFLUSH. We also include a complete evaluation of the MFLUSH policy, in terms of both throughput and energy consumption. Our results indicate that MFLUSH, specifically designed for the emerging CMP+SMT scenario, succeeds not only in overcoming the specific CMP+SMT constraints but also in allowing a 20% reduction in energy consumption without a significant loss of system throughput.

Citations: 0

Optimization of All-to-All Communication on the Blue Gene/L Supercomputer

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.83

Sameer Kumar, Yogish Sabharwal, R. Garg, P. Heidelberger

Citations: 70

Improving Priority Enforcement via Non-Work-Conserving Scheduling

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.38

J. C. Saez, J. I. Gómez, M. Prieto

Citations: 7

Optimized Workflow Orchestration of Database Aggregate Operations on Heterogenous Grids

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.12

W. Mach, E. Schikuta

Citations: 4

Towards Minimum Traffic Cost and Minimum Response Latency: A Novel Dynamic Query Protocol in Unstructured P2P Networks

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI: 10.1109/ICPP.2008.78

Chen Tian, Hongbo Jiang, Xue Liu, Wenyu Liu, Yi Wang

Abstract: Controlled-flooding algorithms are widely used in unstructured networks. Expanding ring (ER) achieves low response delay, but its traffic cost is huge. Dynamic querying (DQ) is known for its desirable behavior in traffic control, but it achieves its lower search cost at the price of undesirable latency. Enhanced dynamic querying (DQ+) can also reduce the search latency, but it is hard to determine a generally optimal parameter set for it. In this paper, a novel algorithm named selective dynamic query (SDQ) is proposed. Unlike previous work that awkwardly processes floating TTL values, SDQ selects an integer TTL value and a set of neighbors to narrow the scope of the next query. Our experiments demonstrate that SDQ provides finer-grained control than other algorithms: its latency is close to the well-known minimum achieved by ER, and its traffic cost is also close to the minimum. To the best of our knowledge, this is the first work capable of achieving the best performance in terms of both response latency and traffic cost. In addition, our experiments demonstrate that SDQ works well in various network topologies.
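The expanding ring baseline mentioned in the abstract can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not SDQ and not the authors' experimental setup: the overlay is a plain adjacency dictionary, every forwarded copy of a query counts as one message, and the names `flood` and `expanding_ring` are hypothetical. It shows why ER traffic grows quickly, since each round repeats the flood from scratch with a larger TTL.

```python
from collections import deque


def flood(graph, source, ttl, holders):
    """TTL-limited flood from source; returns (hits, messages sent)."""
    seen = {source}
    frontier = deque([(source, 0)])
    hits = set()
    messages = 0
    while frontier:
        node, depth = frontier.popleft()
        if node in holders and node != source:
            hits.add(node)
        if depth == ttl:
            continue  # TTL exhausted: do not forward further
        for neighbor in graph[node]:
            messages += 1  # each forwarded copy counts as traffic
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return hits, messages


def expanding_ring(graph, source, holders, wanted, max_ttl):
    """Re-flood with TTL = 1, 2, ... until at least `wanted` hits are found.

    Returns (hits, last TTL used, total messages over all rounds).
    """
    hits, total, ttl = set(), 0, 0
    for ttl in range(1, max_ttl + 1):
        hits, messages = flood(graph, source, ttl, holders)
        total += messages
        if len(hits) >= wanted:
            break
    return hits, ttl, total
```

On the path graph `{0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}` with the only holder at node 3, `expanding_ring(graph, 0, {3}, 1, 5)` stops at TTL 3 after 1 + 3 + 5 = 9 messages, whereas a single flood with TTL 3 would have cost 5.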

Citations: 6