{"title":"Discovering Cellular Automata Rules for Binary Classification Problem with Use of Genetic Algorithm","authors":"A. Piwonska, F. Seredyński, M. Szaban","doi":"10.1109/IPDPSW.2012.81","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.81","url":null,"abstract":"This paper proposes a cellular automata-based solution to a two-dimensional binary classification problem. The proposed method is based on a two-dimensional, three-state cellular automaton (CA) with the von Neumann neighborhood. Since the number of possible CA rules (potential CA-based classifiers) is huge, the search for efficient rules is conducted with a genetic algorithm (GA). Experiments show an excellent performance of the discovered rules in solving the classification problem. The best rules found perform better than the heuristic CA rule designed by a human and also better than one of the most widely used statistical methods: the k-nearest neighbors algorithm (k-NN). Experiments also show that CA rules can be successfully reused in the process of searching for new rules.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130723070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Running Windowed Image Computations on a Pipeline","authors":"R. Vaidyanathan, Phaneendra Vinukonda","doi":"10.1109/IPDPSW.2012.100","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.100","url":null,"abstract":"Many image processing operations manipulate an individual pixel using the values of other pixels in the given pixel's neighborhood. Such operations are called windowed operations. The size of the windowed operation is a measure of the size of the given pixel's neighborhood. A windowed computation applies a windowed operation on all pixels of the image. An image processing application is typically a sequence of windowed computations. While windowed computations admit high parallelism, the cost of inputting and outputting the image often restricts the computation to a few computational units. In this paper we analytically study the running of a sequence of z windowed computations, each of size w, on a z-stage pipelined computational model. For an N × N image and n × n input/output bandwidth per stage, we show that the sequence of windowed computations can be run in (N^2/n^2)(1 + δ) steps, where δ = n/N + 3n^2/wN + zw/N. This produces a speed-up of z/(1 + δ) over a single stage. Generally, N ≫ n > z, w; so the overhead, δ, is dominated by the term which is typically small. This also indicates the time to be relatively independent of the number of stages z.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131419107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MapReduce Based Skyline Services Selection for QoS-aware Composition","authors":"Liang Chen, Li Kuang, Jian Wu","doi":"10.1109/IPDPSW.2012.253","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.253","url":null,"abstract":"Service selection is an important issue of Service-oriented computing (SOC), which is a fundamental step to the composition of complex and large-grained services from single-function components. Skyline operation is recently adopted to select candidate services for composition, as skyline services have better QoS. However, the fast increasing web services, multiple quality attributes to be considered, and dynamic service environment pose a big challenge to skyline service selection. In this paper, we present a parallel skyline service selection method to improve the efficiency by upgrading the MapReduce paradigm. In particular, an angle-based data space partitioning approach is employed in our MapReduce based skyline service selection. To handle the dynamic nature of service environment, we employ Paper-Tape (PT) Model which is used to rapidly locate varying services, and present a dynamic skyline service selection algorithm based on PT model. By experimenting over 10,000 web services along 10 quality attributes, we demonstrate the efficiency of our proposed methods.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131558518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistics-based Workload Modeling for MapReduce","authors":"Hailong Yang, Zhongzhi Luan, Wenjun Li, D. Qian, Gang Guan","doi":"10.1109/IPDPSW.2012.254","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.254","url":null,"abstract":"Large-scale data-intensive computing with the MapReduce framework in the Cloud is becoming pervasive for the core business of many academic, government, and industrial organizations. Hadoop is by far the most successful realization of the MapReduce framework. While MapReduce is easy to use, efficient and reliable for data-intensive computations, the excessive configuration parameters in Hadoop cause unexpected challenges when running various workloads on a Hadoop cluster effectively. Consequently, developers who have less experience with the Hadoop configuration system may devote significant effort to writing an application with poor performance, because they have no idea how these configurations would influence the performance, or they are not even aware that these configurations exist. In this paper, we propose a statistical analysis approach to identify the relationships among workload characteristics, Hadoop configurations and workload performance. Several non-intuitive relationships between workload characteristics and relative performance are revealed and the experimental results demonstrate that our regression models accurately predict the performance of MapReduce workloads under different Hadoop configurations.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132385312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-line Batch Scheduling in Distributed Optical Networks","authors":"Yang Wang, Xiaojun Cao, A. Caciula, Qian Hu","doi":"10.1109/IPDPSW.2012.109","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.109","url":null,"abstract":"Batch scheduling accommodates a group of tasks with start/end time constraints to maximize the revenue from scheduling tasks over a number of servers, which has been extensively studied in the context of Job-machine scheduling. In optical networks, batch scheduling refers to the process of scheduling a group of data units (i.e., the jobs) that compete for the same set of wavelength channels (i.e., the machines). Classical Job-machine scheduling studies considered both the case of a pure-loss system, and the case with waiting rooms (i.e., buffers), which are generally in the form of Random Access Memory (RAM). In optical networks, the buffering is achieved by feeding the optical signal into a fixed length of fiber, namely Fiber Delay Lines (FDLs), since optical RAM is not yet available. The unique feature of the discrete and predefined buffering time in fact instantiates a new type of problem, namely Job-machine scheduling with Discrete-time Buffers. In this work, we comprehensively study batch scheduling in optical networks. We show that batch scheduling with and without FDLs corresponds to two different instances of the Job-machine scheduling problem. While proving their NP-Completeness, we mathematically model both cases using Integer Linear Programming formulations to provide an optimal scheduling. Given the timeliness request for on-line batch scheduling and the dramatic problem size in optical networks, we also propose polynomial-time heuristic algorithms, which are shown to be near-optimal in our simulations.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132439279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Development of Parallel Adaptive Sampling Algorithms for Analyzing Biological Networks","authors":"K. Dempsey, K. Duraisamy, S. Bhowmick, H. Ali","doi":"10.1109/IPDPSW.2012.90","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.90","url":null,"abstract":"The availability of biological data in massive scales continues to represent unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques and efficient parallel computational methods to implement them will be crucial in extracting useful knowledge from this raw unprocessed data, such as in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise from biological correlation networks, thereby making it possible to find biologically relevant clusters from the filtered network. We show how selecting the appropriate filter is crucial in maintaining the key structures from the original networks and uncovering new ones after removing noisy relationships. We also conduct one of the first comparisons in two important sensitivity criteria - the perturbation due to the vertex numbers of the network and perturbations due to data distribution. We demonstrate that our chordal-graph based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128131133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Hybrid Implementation of Graph Coloring Using MPI and OpenMP","authors":"Ahmet Erdem Sarıyüce, Erik Saule, Ümit V. Çatalyürek","doi":"10.1109/IPDPSW.2012.216","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.216","url":null,"abstract":"Graph coloring algorithms are commonly used in large scientific parallel computing either for identifying parallelism or as a tool to reduce computation, such as compressing Hessian matrices. Large scientific computations are nowadays either run on commodity clusters or on large computing platforms. In both cases, the current target platform is hierarchical with distributed memory at the node level and shared memory at the processor level. In this paper, we present a novel hybrid graph coloring algorithm and discuss how to obtain the best performance on such systems from algorithmic, system and engineering perspectives.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128217274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defeating against Sybil-attacks in Peer-to-peer Networks","authors":"X. Xiang","doi":"10.1109/IPDPSW.2012.149","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.149","url":null,"abstract":"There has been a spurt of works showing that the existence of sybil attacks, in which one or more attackers can forge a large number of fictitious identities, is a serious threat to Peer-to-Peer networks. In this paper, we present a distributed protocol to reduce the adverse effects of sybil attacks on the free-riding problem. Our approach focuses on restricting the number of service units that nodes can obtain to a reasonable level. Unlike other protocols, our protocol works well even if there are a large number of sybil nodes in the network. Our results show the promise of the protocol in limiting sybil attacks while not sacrificing application performance.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131684926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards High-Level Programming of Multi-GPU Systems Using the SkelCL Library","authors":"Michel Steuwer, Philipp Kegel, S. Gorlatch","doi":"10.1109/IPDPSW.2012.229","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.229","url":null,"abstract":"Application programming for GPUs (Graphics Processing Units) is complex and error-prone, because the popular approaches - CUDA and OpenCL - are intrinsically low-level and offer no special support for systems consisting of multiple GPUs. The SkelCL library presented in this paper is built on top of the OpenCL standard and offers pre-implemented recurring computation and communication patterns (skeletons) which greatly simplify programming for multi-GPU systems. The library also provides an abstract vector data type and a high-level data (re)distribution mechanism to shield the programmer from the low-level data transfers between the system's main memory and multiple GPUs. In this paper, we focus on the specific support in SkelCL for systems with multiple GPUs and use a real-world application study from the area of medical imaging to demonstrate the reduced programming effort and competitive performance of SkelCL as compared to OpenCL and CUDA. Besides, we illustrate how SkelCL adapts to large-scale, distributed heterogeneous systems in order to simplify their programming.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"26 3‐4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132227470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparison of DAG and Mesh Topologies for Coarse-Grain Reconfigurable Array","authors":"Jonathan Antusiak, Antoine Trouvé, K. Murakami","doi":"10.1109/IPDPSW.2012.24","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.24","url":null,"abstract":"In this paper, we address the hardware overhead of the dynamically reconfigurable functional unit (DRFU) in dynamically reconfigurable processors (DRP), in the context of low-power, embedded system-on-chips (E-SoC). We consider a tightly coupled DRP with a small, coarse-grain DRFU made of four columns of four ALUs. These are interconnected following one of two interconnection schemes: directed acyclic graph or mesh. Given a large set of custom instructions to map on the DRFU, we explore the simplification opportunities on the DRFU in order to reduce its hardware cost. We determine that it is possible to reduce its footprint by about 70 % with respect to the ALUs for both topologies and 50 % with respect to the interconnection between ALUs. We also provide the place and route algorithm to achieve these results. At the end of the paper we compare both topologies with respect to the hardware usage, the opportunities for simplifications and the complexity of the place and route algorithm. We conclude that the mesh topology is in all cases the more desirable.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134101342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}