{"title":"Automatic Evaluation of the Computation Structure of Parallel Applications","authors":"Juan Gonzalez, Judit Giménez, Jesús Labarta","doi":"10.1109/PDCAT.2009.52","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.52","url":null,"abstract":"Many data mining techniques have been proposed for parallel applications performance analysis, the most interesting being clustering analysis. Most cases have been used to detect processors with similar behavior. In previous work, we presented a different approach: clustering was used to detect the computation structure of the applications and how these different computation phases behave. In this paper, we present a method to evaluate the accuracy of this structure detection. This new method is based on the Single Program Multiple Data (SPMD) paradigm exhibited by real parallel programs. Assuming an SPMD structure, we expect that all tasks of a parallel application execute the same operation sequence. Using a Multiple Sequence Alignment (MSA) algorithm, we check the sequence ordering of the detected clusters to evaluate the quality of the clustering results.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"26 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130699430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ICOMC: Invocation Complexity Of Multi-Language Clients for Classified Web Services and its Impact on Large Scale SOA Applications","authors":"Xiaoyi Lu, Yongqiang Zou, F. Xiong, Jian Lin, L. Zha","doi":"10.1109/PDCAT.2009.74","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.74","url":null,"abstract":"Theoretically, invoking web services from multi-language clients is no longer a problem thanks to XML-based interface descriptions in WSDL, but the reality is not so good. Some implementation-level difficulties still exist when invoking web services from clients in different programming languages. These difficulties are caused by involving complex data structures in the service interface, carrying additional information such as WS-Security headers in the SOAP messages, missing language features such as Reflection in C/C++ and so on, which make large scale multi-language SOA application development time-consuming and error-prone work. This paper proposes a new complexity ICOMC, short for Invocation Complexity Of Multi-language Clients, to quantify these difficulties, introduces implementation cost and runtime performance metrics for ICOMC, and identifies three factors dominating the ICOMC: service interface, message context, and language feature. Consequently, the problem is formulated as finding out the correlation of the three factors to ICOMC. To simplify the problem, web services are classified into four categories: SISM, SICM, CISM and CICM according to service interface complexity and message context complexity. Furthermore, micro-benchmark experiments are done in C/C++/Java for all four categories. This paper also takes the GOS System Software of the China National Grid as a real large scale application to implement its C/C++ client APIs and compare them with the original Java APIs. Evaluations based on micro-benchmarks and a real application show the correlations between the factors and ICOMC. Our results benefit web service interface design, appropriate language adoption, and implementation cost / runtime performance estimation.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130291702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelized Critical Path Search in Electrical Circuit Designs","authors":"Pascal Bolzhauser, Anthony Sulistio, Gerhard Angst, C. Reich","doi":"10.1109/PDCAT.2009.83","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.83","url":null,"abstract":"For finding the critical path in electrical circuit designs, a shortest-path search must be carried out. This paper introduces a new two-level shortest-path search algorithm specially adapted for parallelization. The proposed algorithm is based on a module-based partitioning algorithm and a shortest-path search parallelized for the usage on multi-core systems. Experimental results show the impact of this approach.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132959172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Theoretical and Empirical Analysis of a GPU Based Parallel Bayesian Optimization Algorithm","authors":"Asim Munawar, M. Wahib, M. Munetomo, K. Akama","doi":"10.1109/PDCAT.2009.32","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.32","url":null,"abstract":"General Purpose computing over Graphical Processing Units (GPGPUs) is a huge shift of paradigm in parallel computing that promises a dramatic increase in performance. But GPGPUs also bring an unprecedented level of complexity in algorithmic design and software development. In this paper we describe the challenges and design choices involved in parallelization of Bayesian Optimization Algorithm (BOA) to solve complex combinatorial optimization problems over nVidia commodity graphics hardware using Compute Unified Device Architecture (CUDA). BOA is a well-known multivariate Estimation of Distribution Algorithm (EDA) that incorporates methods for learning Bayesian Network (BN). It then uses BN to sample new promising solutions. Our implementation is fully compatible with modern commodity GPUs and therefore we call it gBOA (BOA on GPU). In the results section, we show several numerical tests and performance measurements obtained by running gBOA over an nVidia Tesla C1060 GPU. We show that in the best case we can obtain a speedup of up to 13x.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114726262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CheCUDA: A Checkpoint/Restart Tool for CUDA Applications","authors":"H. Takizawa, Katsuto Sato, K. Komatsu, Hiroaki Kobayashi","doi":"10.1109/PDCAT.2009.78","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.78","url":null,"abstract":"In this paper, a tool named CheCUDA is designed to checkpoint CUDA applications that use GPUs as accelerators. As existing checkpoint/restart implementations do not support checkpointing the GPU status, CheCUDA hooks a part of basic CUDA driver API calls in order to record the status changes on the main memory. At checkpointing, CheCUDA stores the status changes in a file after copying all necessary data in the video memory to the main memory and then disabling the CUDA runtime. At restarting, CheCUDA reads the file, re-initializes the CUDA runtime, and recovers the resources on GPUs so as to restart from the stored status. This paper demonstrates that a prototype implementation of CheCUDA can correctly checkpoint and restart a CUDA application written with basic APIs. This also indicates that CheCUDA can migrate a process from one PC to another even if the process uses a GPU. Accordingly, CheCUDA is useful not only to enhance the dependability of CUDA applications but also to enable dynamic task scheduling of CUDA applications required especially on heterogeneous GPU cluster systems. This paper also shows the timing overhead for checkpointing.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"429 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116541984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistent Fixed Points and Negative Gain","authors":"H. B. Acharya, E. Elmallah, M. Gouda","doi":"10.1109/PDCAT.2009.85","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.85","url":null,"abstract":"We discuss the stabilization properties of networks that are composed of “displacement elements”. Each displacement element is defined by an integer K, called the displacement of the element, an input variable x, and an output variable y, where the values of x and y are non-negative integers. An execution step of this element assigns to y the maximum of 0 and K + x. The objective of our discussion is to demonstrate that two principles play an important role in ensuring that a network N is stabilizing, i.e., starting from any global state, network N is guaranteed to reach a global fixed point. Specifically, the principle of consistent fixed points is analogous to the requirement that a control system be free from self-oscillations. And the principle of negative gain is analogous to the requirement that the feedback loop of a sum of displacements along every directed loop in network N is negative.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127591182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Design and Evaluation of a Distributed Reliable File System","authors":"D. Peric, T. Bocek, F. Hecht, D. Hausheer, B. Stiller","doi":"10.1109/PDCAT.2009.37","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.37","url":null,"abstract":"Peer-to-peer (P2P) systems are, in contrast to client-server (C/S) systems, fault-tolerant, robust, and scalable. While C/S distributed file systems, such as NFS (Network File System) or SMB (Server Message Block), do not scale with respect to the number of clients and exhibit a single point of failure, P2P file systems have the potential to cope with an increasing number of participants. Thus, this paper presents DRFS (Distributed Reliable File System), a P2P file system for cooperative environments. DRFS uses random, content-independent identifiers for data storage, while maintaining high performance and low overhead with many concurrent reads and writes. A dynamic replication mechanism ensures data availability, even under high churn. The application scenario considers an office environment, where DRFS is installed on the machines of employees, who store and request files. DRFS has been implemented using the Filesystem in Userspace (FUSE) interface, in order to provide users with transparent read and write operations. Experiments show the benefits of such a peer-to-peer architecture, when a small number of peers reads or writes in parallel: DRFS performs better than NFS as soon as 6 peers read or write a 32 MB file in parallel. For unpopular files, it is also more reliable than IgorFS.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129843820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analytical Study of Adversarial Strategies in Cluster-based Overlays","authors":"E. Anceaume, F. Brasileiro, R. Ludinard, B. Sericola, F. Tronel","doi":"10.1109/PDCAT.2009.62","DOIUrl":"https://doi.org/10.1109/PDCAT.2009.62","url":null,"abstract":"Awerbuch and Scheideler have shown that peer-to-peer overlay networks can survive Byzantine attacks only if malicious nodes are not able to predict what will be the topology of the network for a given sequence of join and leave operations. In this paper we investigate adversarial strategies by following specific protocols. Our analysis demonstrates first that an adversary can very quickly subvert DHT-based overlays by simply never triggering leave operations. We then show that when all nodes (honest and malicious ones) have a limited lifetime imposed on them, the system eventually reaches a stationary regime where the ratio of polluted clusters is bounded, independently of the initial amount of corruption in the system.","PeriodicalId":312929,"journal":{"name":"2009 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124074411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}