{"title":"A Model for Automatic On-Line Process Behavior Extraction, Classification and Prediction in Heterogeneous Distributed Systems","authors":"E. Dodonov, R. Mello","doi":"10.1109/CCGRID.2007.6","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.6","url":null,"abstract":"The performance of distributed computing is usually limited by the heterogeneous nature of distributed systems, data transfer rate and access latency. Techniques such as adaptive process migration, data caching and prefetching were developed to overcome this limitation. However, such techniques require the knowledge of application behavior in order to be effective. In this sense, we intend to propose a new model for application behavior prediction that, by classifying and analyzing application access patterns, is able to predict future application behavior. The model aims to allow a transparent and automatic process behavior extraction, classification and prediction, using a variable set of techniques, including stochastic models and artificial intelligence-based approaches.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133949717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transparent Symmetric Active/Active Replication for Service-Level High Availability","authors":"C. Engelmann, S. Scott, C. Leangsuksun, Xubin He","doi":"10.1109/CCGRID.2007.116","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.116","url":null,"abstract":"As service-oriented architectures become more important in parallel and distributed computing systems, individual service instance reliability as well as appropriate service redundancy becomes an essential necessity in order to increase overall system availability. This paper focuses on providing redundancy strategies using service-level replication techniques. Based on previous research using symmetric active/active replication, this paper proposes a transparent symmetric active/active replication approach that allows for more reuse of code between individual service-level replication implementations by using a virtual communication layer. Service- and client-side interceptors are utilized in order to provide total transparency. Clients and servers are unaware of the replication infrastructure as it provides all necessary mechanisms internally.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133059480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Simulation Study of Data Partitioning Algorithms for Multiple Clusters","authors":"Chen Yu, D. Marinescu, H. Siegel, J. Morrison","doi":"10.1109/CCGRID.2007.13","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.13","url":null,"abstract":"Recently we proposed algorithms for concurrent execution on multiple clusters [11]. In this case, data partitioning is done at two levels; first, the data is distributed to a collection of heterogeneous parallel systems with different resources and startup time, then, on each system the data is evenly partitioned to the available nodes. In this paper, we report on a simulation study of the algorithms.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122121741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System","authors":"Lei Chai, Qi Gao, D. Panda","doi":"10.1109/CCGRID.2007.119","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.119","url":null,"abstract":"Multi-core processors are growing as a new industry trend as single core processors rapidly reach the physical limits of possible complexity and speed. In the new Top500 supercomputer list, more than 20% processors belong to the multi-core processor family. However, without an in-depth study on application behaviors and trends on multi-core clusters, we might not be able to understand the characteristics of multi-core cluster in a comprehensive manner and hence not be able to get optimal performance. In this paper, we take on these challenges and design a set of experiments to study the impact of multi-core architecture on cluster computing. We choose to use one of the most advanced multi-core servers, Intel Bensley system with Woodcrest processors, as our evaluation platform, and use benchmarks including HPL, NAMD, and NAS as the applications to study. From our message distribution experiments, we find that on an average about 50% messages are transferred through intra-node communication, which is much higher than intuition. This trend indicates that optimizing intra-node communication is as important as optimizing inter-node communication in a multi-core cluster. We also observe that cache and memory contention may be a potential bottleneck in multi-core clusters, and communication middleware and applications should be multi-core aware to alleviate this problem. We demonstrate that multi-core aware algorithm, e.g. data tiling, improves benchmark execution time by up to 70%. We also compare the scalability of a multi-core cluster with that of a single-core cluster and find that the scalability of the multi-core cluster is promising.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124921119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redundancy Management for P2P Storage","authors":"C. Williams, Philippe Huibonhoa, J. Holliday, A. Hospodor, T. Schwarz","doi":"10.1109/CCGRID.2007.93","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.93","url":null,"abstract":"P2P storage systems must protect data against temporary unavailability and the effects of churn in order to become platforms for safe storage. This paper evaluates and compares redundancy techniques for P2P storage according to availability, accessibility, and maintainability using traces and theoretical results.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130353340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Heterogeneity for Collective Data Downloading in Volunteer-based Networks","authors":"Jinoh Kim, A. Chandra, J. Weissman","doi":"10.1109/CCGRID.2007.50","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.50","url":null,"abstract":"Scientific computing is being increasingly deployed over volunteer-based distributed computing environments consisting of idle resources on donated user machines. A fundamental challenge in these environments is the dissemination of data to the computation nodes, with the successful completion of jobs being driven by the efficiency of collective data download across compute nodes, and not only the individual download times. This paper considers the use of a data network consisting of data distributed across a set of data servers, and focuses on the server selection problem: how do individual nodes select a server for downloading data to minimize the communication makespan - the maximal download time for a data file. Through experiments conducted on a pastry network running on PlanetLab, we demonstrate that nodes in a volunteer-based network are heterogeneous in terms of several metrics, such as bandwidth, load, and capacity, which impact their download behavior. We propose new server selection heuristics that incorporate these metrics, and demonstrate that these heuristics outperform traditional proximity-based server selection, reducing average makespans by at least 30%. We further show that incorporating information about download concurrency avoids overloading servers, and improves performance by about 17-43% over heuristics considering only proximity and bandwidth.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129164727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Integrated Approach for Managing Peer-to-Peer Desktop Grid Systems","authors":"S. Schulz, W. Blochinger","doi":"10.1109/CCGRID.2007.21","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.21","url":null,"abstract":"In this paper we propose a comprehensive integrated management architecture for large-scale desktop grid systems. By visualization of managed entities and automation of repetitive management tasks, we deal with the additional complexity induced by the size of these systems. We introduce the concepts of peer-to-peer and disconnected management to cope with network segmentation and node volatility, which are both intrinsic to desktop grid systems. By studying real-world use cases, we demonstrate the applicability of our solution.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117179995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Condor-based Services for Distributed Image Analysis","authors":"Simon Caton, O. Rana, B. Batchelor","doi":"10.1109/CCGRID.2007.44","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.44","url":null,"abstract":"Interactive image processing is an important requirement in many industrial applications, such as the inspection of industrial parts within a manufacturing environment, or the processing of images from surveillance cameras. Being able to achieve this quickly and accurately is often essential for the success of such industrial applications. A service-based approach that autonomously launches Image Analysis Services (accessible through a Central Service Manager) onto spare network resources through a Condor system is presented. This allows high throughput analysis of these images in a dynamic resource pool. The Central Service Manager reacts to new tasks submitted to the Image Analysis Services and is able to add new service instances to manage these tasks dynamically. Each service instance here corresponds to a computational resource that is able to execute image processing algorithms. New service instances may be requested by the Central Service Manager from the Condor system, based on the number of tasks that need to be processed. This enables entire image repositories to be acted upon interactively and in parallel, as opposed to the analysis of single images individually. The approach is demonstrated through a campus-wide test bed utilising a Condor system with 90 machines.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124598739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Access Control Policy Combinations for the Grid Using the Policy Machine","authors":"Vincent C. Hu, David F. Ferraiolo, K. Scarfone","doi":"10.1109/CCGRID.2007.15","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.15","url":null,"abstract":"Many researchers have tackled the architecture and requirements aspects of grid security, concentrating on the authentication or authorization mediation instead of authorization techniques, especially the topic of policy combination. Policy combination is an essential requirement of grid, not only because of the required remote (or global) vs. local interaction between grid members, but also the dynamic scalability nature of handling the joining and leaving of grid membership. However, evolving from the general security requirements of grid, the independency of a grid member's access control system is critical and needs to be maintained when the access decision is determined by the combination of global and local access control policies. The Policy Machine (PM) provides features which not only can meet the significant independency requirement but also have better performance, easier management, and more straightforward policy expression than most of the popular policy combination techniques for grid.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121570840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated Data Reorganization and Disk Mapping for Reducing Disk Energy Consumption","authors":"S. Son, M. Kandemir","doi":"10.1109/CCGRID.2007.64","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.64","url":null,"abstract":"Increasing power consumption of high-performance systems leads to reliability, survivability, and cooling related problems. Motivated by this observation, several recent efforts focused on reducing disk power consumption through hardware, OS and compiler based techniques. This paper presents a novel approach to reducing disk power consumption of large-scale, array-intensive scientific applications. It proposes and evaluates a compiler-based approach that employs two complementary techniques: data reorganization and disk mapping. The first of these, data reorganization, determines a suitable layout for data in the array space, whereas the second technique, disk mapping, decides the corresponding layout in the disk space. The goal of data reorganization and disk mapping is to ensure that data (from the different disk-resident arrays) that are accessed within the same loop iteration are colocated in the same set of disks. In this way, we can increase disk inter-access times (idle periods of disks) and this in turn allows better exploitation of the underlying hardware mechanisms used for reducing power. Our experiments with eight disk I/O-intensive scientific applications indicate that the proposed approach brings significant reductions in energy consumption, whether the underlying disk system uses spin-down disks or speed-reduced disks, two previously-proposed hardware-based disk power reduction schemes. The results also show that both the components of our scheme (data reorganization and disk mapping) are very important since applying any of these components alone does not generate large savings for most of our applications.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117147084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}