{"title":"Interconnected Traffic with Real Mobility Tool for Ad Hoc Networks","authors":"A. Doci","doi":"10.1109/ICPP-W.2008.32","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.32","url":null,"abstract":"Due to the slow deployment of ad hoc networks, their protocol performance is mainly measured in simulation environments and uses synthetic mobility and traffic models. The synthetic mobility and traffic models are designed independently of each other and work under the assumption that wireless nodes start and remain in the simulation for a user-specified simulation time. This paper shows that mobility and traffic are interconnected. We announce the implementation of the interconnected traffic tool and show that under real mobility and interconnected traffic the performance metrics need to be re-thought. Therefore, we propose availability as a new performance metric and evaluate protocol performance under synthetic and real mobility models. We offer the code to anyone interested.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122225538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Mobility Management Scheme for IEEE 802.11-Based Wireless Mesh Networks","authors":"Zhenxia Zhang, A. Boukerche","doi":"10.1109/ICPP-W.2008.22","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.22","url":null,"abstract":"Recent advances in Wireless Mesh Networks (WMNs) have overcome the drawbacks of traditional wired networks and wireless ad hoc networks. WMNs are going to play a highly promising role in the next generation of networks. Mobility management is one of the most significant management services for WMNs. Due to the inherent characteristics of WMNs, such as relatively static backbones and highly mobile clients, the question of how to provide seamless mobility management for WMNs is the driving force behind research. In this paper, a novel intra-domain mobility management scheme for WMNs is presented. A hybrid routing algorithm is used to forward packets, and during handoff, gratuitous ARP messages are used to provide the new routing information, thus avoiding re-routing and location update. Real-time applications over 802.11 WMNs can be supported by this scheme, such as VoIP, etc.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116636449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analysis of QoS Provisioning for Sockets Direct Protocol vs. IPoIB over Modern InfiniBand Networks","authors":"Ryan E. Grant, Mohammad J. Rashti, A. Afsahi","doi":"10.1109/ICPP-W.2008.25","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.25","url":null,"abstract":"The introduction of quality of service (QoS) features for socket-based communication over InfiniBand networks provides the opportunity to enact service differentiation for traditional socket-based applications over high performance networks for the first time. The effectiveness of such techniques in providing control over the quality of service that individual connections experience is important in managing traffic in modern data centers. In this paper, we quantitatively analyze the performance benefits of QoS provisioning in InfiniBand networks for sockets direct protocol (SDP) and IPoIB. We find that QoS provisioning can provide prioritized service for sockets-based streams, with more apparent impact on SDP traffic than IPoIB.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114449494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding Locality-Awareness in Peer-to-Peer Systems","authors":"Xiongfei Weng, Hongliang Yu, G. Shi, Jing Chen, Xu Wang, Jing Sun, Weimin Zheng","doi":"10.1109/ICPP-W.2008.15","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.15","url":null,"abstract":"Locality-awareness is one of the essential characteristics for peer-to-peer (P2P) systems. Recently, many locality-aware algorithms have been proposed, in which locality can be defined as different network metrics. In this paper, we compare different performance optimization goals between peer users and ISPs, and then present a detailed simulation study to accurately explore how locality-aware algorithms based on different network metrics influence the performance of real P2P systems. Two widely deployed P2P systems, including BitTorrent, a content-distribution system, and CoolStreaming, a media streaming system, are tested under the real data set from PlanetLab in our extensive simulations. Experimental results suggest that selecting neighbors within the same AS is desirable, which can decrease user experienced delays and keep traffic locality.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127850244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems","authors":"Jinpil Lee, M. Sato, T. Boku","doi":"10.1109/ICPP-W.2008.28","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.28","url":null,"abstract":"OpenMPD is a language extension for programming on distributed memory systems that helps users by having minimal and simple notations. Although MPI is the de facto standard for parallel programming on distributed memory systems, writing MPI programs is often a time-consuming and complicated process. OpenMPD supports typical parallelization based on the data parallel paradigm and work sharing, and enables parallelizing the original sequential code using minimal modification with simple directives, like OpenMP. And for flexibility, it allows to combine with explicit MPI coding on parallelization with OpenMP for more complicated parallel codes. Experimental results of our implementation show that OpenMPD achieves three to eight times speed-up on a PC cluster with eight processors given a small modification to the original sequential code.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124645503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TCP/IP Performance Near I/O Bus Bandwidth on Multi-Core Systems: 10-Gigabit Ethernet vs. Multi-Port Gigabit Ethernet","authors":"Hyun-Wook Jin, Yeon-Ji Yun, Hye-Churn Jang","doi":"10.1109/ICPP-W.2008.33","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.33","url":null,"abstract":"With significant advances in network interfaces, I/O bus, and processor architecture of end node, innovative approaches are required to achieve high network bandwidth by fully utilizing available system resources. The issues related can be summarized into two: (i) Utilizing I/O bus bandwidth for high bandwidth network connection and (ii) Utilizing multiple cores for high packet processing throughput. In this paper, we conduct several experiments on a multi-core system with 10 GigE and multi-port 1 GigE network interfaces. We aim to show the impact of system configurations on the network performance and compare the performance of two different network interfaces. The experimental results show that, with the proper interrupt affinity configurations, the multi-port 1 GigE can achieve comparable bandwidth to 10 GigE. The peak bandwidth achieved by the multi-port 1 GigE is 6.7 Gbps, which is more than 80% of the theoretical maximum I/O bus bandwidth on the experimental system. We, however, also show that the multi-port 1 GigE can consume much more processor resource than 10 GigE. More importantly, we reveal that processing the packets on many cores can result in more resource consumption without much benefit. This can be because of locking overhead between softirqs running on different cores and lower cache efficiency. We show that the more tuning on the configuration cannot overcome this side effect.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120983143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling Task Graphs on Heterogeneous Multiprocessors with Reconfigurable Hardware","authors":"J. Teller, F. Özgüner, R. Ewing","doi":"10.1109/ICPP-W.2008.39","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.39","url":null,"abstract":"We address the problem of scheduling applications represented as directed acyclic task graphs (DAGs) onto architectures with reconfigurable processing cores. We introduce the Mutually Exclusive Processor Groups reconfiguration model, a novel reconfiguration model that captures many different modes of reconfiguration. Additionally, we propose the Heterogeneous Earliest Finish Time with Mutually Exclusive Processor Groups (HEFT-MEG) scheduling heuristic using the Mutually Exclusive Processor Groups reconfiguration model. HEFT-MEG schedules reconfigurations using a novel back-tracking algorithm to evaluate how different reconfiguration decisions affect previously scheduled tasks. HEFT-MEG's goal when choosing configurations is to choose the most efficient configuration for different application phases. In simulation, HEFT-MEG generates higher quality schedules than those generated by the hardware-software co-scheduler proposed by Mei, et al. [21] and HEFT [31] using a single configuration.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115497059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Analysis and Optimization of Parallel Scientific Applications on CMP Cluster Systems","authors":"Xingfu Wu, V. Taylor, Charles W. Lively, S. Sharkawi","doi":"10.1109/ICPP-W.2008.21","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.21","url":null,"abstract":"Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system. A major challenge to be addressed is efficient use of such cluster systems for large-scale scientific applications. In this paper, we quantify the performance gap resulting from using different number of processors per node; this information is used to provide a baseline for the amount of optimization needed when using all processors per node on CMP clusters. We conduct detailed performance analysis to identify how applications can be modified to efficiently utilize all processors per node on CMP clusters, especially focusing on two scientific applications: a 3D particle-in-cell, magnetic fusion application gyrokinetic toroidal code (GTC) and a lattice Boltzmann method for simulating fluid dynamics (LBM). In terms of refinements, we use conventional techniques such as cache blocking, loop unrolling and loop fusion, and develop hybrid methods for optimizing MPI_Allreduce and MPI_Reduce. Using these optimizations, the application performance for utilizing all processors per node was improved by up to 18.97% for GTC and 15.77% for LBM on up to 2048 total processors on the CMP clusters.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116022490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fuzzy-Based Handover System for Avoiding Ping-Pong Effect in Wireless Cellular Networks","authors":"L. Barolli, F. Xhafa, A. Durresi, A. Koyama","doi":"10.1109/ICPP-W.2008.11","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.11","url":null,"abstract":"Many handover algorithms are proposed in the literature. However, to make a better handover and keep the QoS in wireless networks is very difficult. In this paper, we propose a new handover system based on fuzzy logic. The proposed system uses 3 parameters for handoff decision: the change of signal strength of the present Base Station (BS), signal strength from the neighbor BS, and the distance between Mobile Station (MS) and BS. The performance evaluation via simulations shows that proposed system can avoid ping-pong effect and has a good handover decision.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130818530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fault Tolerance Scheme for Hierarchical Dynamic Schedulers in Grids","authors":"Nitin B. Gorde, S. Aggarwal","doi":"10.1109/ICPP-W.2008.7","DOIUrl":"https://doi.org/10.1109/ICPP-W.2008.7","url":null,"abstract":"In dynamic grid environment failures (e.g. link down, resource failures) are frequent. We present a fault tolerance scheme for hierarchical dynamic scheduler (HDS) for grid workflow applications. In HDS all resources are arranged in a hierarchy tree and each resource acts as a scheduler. The fault tolerance scheme is fully distributed and is responsible for maintaining the hierarchy tree in the presence of failures. Our fault tolerance scheme handles root failures specially, which avoids root becoming single point of failure. The resources detecting failures are responsible for taking appropriate actions. Our fault tolerance scheme uses randomization to get rid of multiple simultaneous failures. Our simulation results show that the recovery process is fast and the failures affect minimally to the scheduling process.","PeriodicalId":231042,"journal":{"name":"2008 International Conference on Parallel Processing - Workshops","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131335342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}