2008 International Conference on Parallel Processing - Workshops: Latest Papers, Page 2

Interconnected Traffic with Real Mobility Tool for Ad Hoc Networks

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.32

A. Doci

Citations: 3

A Novel Mobility Management Scheme for IEEE 802.11-Based Wireless Mesh Networks

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.22

Zhenxia Zhang, A. Boukerche

Citations: 8

An Analysis of QoS Provisioning for Sockets Direct Protocol vs. IPoIB over Modern InfiniBand Networks

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.25

Ryan E. Grant, Mohammad J. Rashti, A. Afsahi

Citations: 17

Understanding Locality-Awareness in Peer-to-Peer Systems

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.15

Xiongfei Weng, Hongliang Yu, G. Shi, Jing Chen, Xu Wang, Jing Sun, Weimin Zheng

Citations: 12

OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.28

Jinpil Lee, M. Sato, T. Boku

Citations: 7

TCP/IP Performance Near I/O Bus Bandwidth on Multi-Core Systems: 10-Gigabit Ethernet vs. Multi-Port Gigabit Ethernet

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.33

Hyun-Wook Jin, Yeon-Ji Yun, Hye-Churn Jang

Abstract: With significant advances in network interfaces, I/O bus, and processor architecture of end node, innovative approaches are required to achieve high network bandwidth by fully utilizing available system resources. The issues related can be summarized into two: (i) Utilizing I/O bus bandwidth for high bandwidth network connection and (ii) Utilizing multiple cores for high packet processing throughput. In this paper, we conduct several experiments on a multi-core system with 10 GigE and multi-port 1 GigE network interfaces. We aim to show the impact of system configurations on the network performance and compare the performance of two different network interfaces. The experimental results show that, with the proper interrupt affinity configurations, the multi-port 1 GigE can achieve comparable bandwidth to 10 GigE. The peak bandwidth achieved by the multi-port 1 GigE is 6.7 Gbps, which is more than 80% of the theoretical maximum I/O bus bandwidth on the experimental system. We, however, also show that the multi-port 1 GigE can consume much more processor resource than 10 GigE. More importantly, we reveal that processing the packets on many cores can result in more resource consumption without much benefit. This can be because of locking overhead between softirqs running on different cores and lower cache efficiency. We show that the more tuning on the configuration cannot overcome this side effect.

Citations: 7

Scheduling Task Graphs on Heterogeneous Multiprocessors with Reconfigurable Hardware

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.39

J. Teller, F. Özgüner, R. Ewing

Citations: 7

Performance Analysis and Optimization of Parallel Scientific Applications on CMP Cluster Systems

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.21

Xingfu Wu, V. Taylor, Charles W. Lively, S. Sharkawi

Abstract: Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system. A major challenge to be addressed is efficient use of such cluster systems for large-scale scientific applications. In this paper, we quantify the performance gap resulting from using different number of processors per node; this information is used to provide a baseline for the amount of optimization needed when using all processors per node on CMP clusters. We conduct detailed performance analysis to identify how applications can be modified to efficiently utilize all processors per node on CMP clusters, especially focusing on two scientific applications: a 3D particle-in-cell, magnetic fusion application gyrokinetic toroidal code (GTC) and a lattice Boltzmann method for simulating fluid dynamics (LBM). In terms of refinements, we use conventional techniques such as cache blocking, loop unrolling and loop fusion, and develop hybrid methods for optimizing MPI_Allreduce and MPI_Reduce. Using these optimizations, the application performance for utilizing all processors per node was improved by up to 18.97% for GTC and 15.77% for LBM on up to 2048 total processors on the CMP clusters.

Citations: 24

A Fuzzy-Based Handover System for Avoiding Ping-Pong Effect in Wireless Cellular Networks

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.11

L. Barolli, F. Xhafa, A. Durresi, A. Koyama

Citations: 35

A Fault Tolerance Scheme for Hierarchical Dynamic Schedulers in Grids

2008 International Conference on Parallel Processing - Workshops Pub Date : 2008-09-08 DOI: 10.1109/ICPP-W.2008.7

Nitin B. Gorde, S. Aggarwal

Citations: 20