L. Weng, G. Agrawal, Ümit V. Çatalyürek, T. Kurç, S. Narayanan, J. Saltz
{"title":"An approach for automatic data virtualization","authors":"L. Weng, G. Agrawal, Ümit V. Çatalyürek, T. Kurç, S. Narayanan, J. Saltz","doi":"10.1109/HPDC.2004.2","DOIUrl":"https://doi.org/10.1109/HPDC.2004.2","url":null,"abstract":"Analysis of large and/or geographically distributed scientific datasets is emerging as a key component of grid computing. One challenge in this area is that scientific datasets are typically stored as binary or character flat-files, which makes specification of processing much harder. In view of this, there has been recent interest in data virtualization, and data services to support such virtualization. This paper presents an approach for automatically creating data services to support data virtualization. Specifically, we show how a relational table like data abstraction can be supported for complex multidimensional scientific datasets that are resident on a cluster. We have designed and implemented a tool that processes SQL queries (with select and where statements) on multi-dimensional datasets. We have designed a meta-data description language that is used for specifying the data layout. From such description, our tool automatically generates efficient data subsetting and access functions. We have extensively evaluated our system. The key observations from our experiments are as follows. First, our tool can correctly and efficiently handle a variety of different data layouts. Second, our system scales well as the number of nodes or the amount of data is scaled. Third, the performance of the automatically generated code for indexing and contracting functions is quite comparable to the performance of hand-written codes.","PeriodicalId":446429,"journal":{"name":"Proceedings. 
13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124466466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xi Zhang, T. Kurç, T. Pan, Ümit V. Çatalyürek, S. Narayanan, P. Wyckoff, J. Saltz
{"title":"Strategies for using additional resources in parallel hash-based join algorithms","authors":"Xi Zhang, T. Kurç, T. Pan, Ümit V. Çatalyürek, S. Narayanan, P. Wyckoff, J. Saltz","doi":"10.1109/HPDC.2004.34","DOIUrl":"https://doi.org/10.1109/HPDC.2004.34","url":null,"abstract":"Hash-based join is a compute- and memory-intensive algorithm. It achieves good performance and scales well to large datasets, if sufficient memory is available to hold the hash table and the distribution of computing load across nodes is balanced. We compare three adaptive algorithms that start with a partitioning of the hash table across a group of nodes and expand during the hash table building phase to additional resources, when memory on a node is used up. The split-based algorithm partitions the hash table range assigned to the node, on which memory is full, into two segments and assigns one of the segments to a new node in the system. The replication-based algorithm replicates the hash table range on a new node. The hybrid algorithm combines the first and second strategies in order to address each strategy's shortcomings. We perform an experimental performance evaluation of these algorithms on a PC cluster. Our results show that among the three algorithms, in most cases the hybrid algorithm either performs close to the better of the two or is the best algorithm.","PeriodicalId":446429,"journal":{"name":"Proceedings. 
13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127978298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discouraging free riding in a peer-to-peer CPU-sharing grid","authors":"N. Andrade, F. Brasileiro, W. Cirne, M. Mowbray","doi":"10.1109/HPDC.2004.9","DOIUrl":"https://doi.org/10.1109/HPDC.2004.9","url":null,"abstract":"Grid computing has excited many with the promise of access to huge amounts of resources distributed across the globe. However, there are no largely adopted solutions for automatically assembling grids, and this limits the scale of today's grids. Some argue that this is due to the overwhelming complexity of the proposed economy-based solutions. Peer-to-peer grids have emerged as a less complex alternative. We are currently deploying OurGrid, one such peer-to-peer grid. OurGrid is a CPU-sharing grid that targets bag-of-tasks applications (i.e. parallel applications whose tasks are independent). In order to ease system deployment, OurGrid is based on a very lightweight autonomous reputation scheme. Free riding is an important issue for any peer-to-peer system. The aim is to show that OurGrid's reputation system successfully discourages free riding, making it in each peer's own interest to collaborate with the peer-to-peer community. We show this in two steps. First, we analyze the conditions under which a reputation scheme can discourage free riding in a CPU-sharing grid. Second, we show that OurGrid's reputation scheme satisfies these conditions, even in the presence of malicious peers. Unlike other distributed mechanisms for discouraging free riding, OurGrid's reputation scheme achieves this without requiring a shared cryptographic infrastructure or specialized storage.","PeriodicalId":446429,"journal":{"name":"Proceedings. 
13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132736264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WS-ResourceFramework on .NET","authors":"G. Wasson, N. Beekwilder, M. Morgan, M. Humphrey","doi":"10.1109/HPDC.2004.42","DOIUrl":"https://doi.org/10.1109/HPDC.2004.42","url":null,"abstract":"The WSRF specifications [Foster, I. et al., (2004)] represent the merging of \"the Web\" and \"the grid\". This poster describes a design to achieve compliance with the WS-ResourceFramework specifications using Microsoft .NET technologies. Our design seeks to leverage Microsoft tools wherever possible and to make WSRF compliant services easy to program. While our work on OGSI.NET [Wasson, G. et al., (2004)] provides invaluable insight that guides the design of WSRF.NET, we feel that a different set of abstractions is necessary to capture the full potential of the WS-ResourceFramework. This poster describes our work to date on WSRF.NET. The poster discusses topics such as the implementation of WS-Resources, the WSRF.NET programming model, our security architecture and our future release plans (including our first release at HPDC 13).","PeriodicalId":446429,"journal":{"name":"Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"355 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122799531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Achieving performance consistency in heterogeneous clusters","authors":"Changxun Wu, R. Burns","doi":"10.1109/HPDC.2004.1","DOIUrl":"https://doi.org/10.1109/HPDC.2004.1","url":null,"abstract":"Hash-based randomization is a powerful technique used in clusters and distributed systems for load management. It offers uniform distribution, efficient addressing, little shared state, and scalability. However, simple hash-based randomization is unable to deal with skew and heterogeneity and, therefore, cannot achieve load balance in many environments. Virtual processors have been proposed as a solution to simple randomization's problem. We evaluate an alternative load management scheme for heterogeneous, shared-disk clusters. Our scheme directly tunes hash-based randomized load placement using a technique called adaptive, nonuniform (ANU) randomization [2003] and compares favorably to the virtual processor approach. It provides the load balancing benefits of virtual processors with less shared state. It also automatically adapts to workload and cluster configuration changes, such as failure and recovery and adding or removing servers, without human involvement. Experimental results show that our scheme outperforms virtual processors and performs comparably to prescient load-balancing algorithms. They also show that our system maintains consistent performance across all servers while moving a minimal amount of load.","PeriodicalId":446429,"journal":{"name":"Proceedings. 
13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116911235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Haupt, Anand Kalyanasundaram, Nisreen Ammari, Archana Chilukuri, Maxim Khotournenko
{"title":"SPURport","authors":"T. Haupt, Anand Kalyanasundaram, Nisreen Ammari, Archana Chilukuri, Maxim Khotournenko","doi":"10.1109/HPDC.2004.33","DOIUrl":"https://doi.org/10.1109/HPDC.2004.33","url":null,"abstract":"The poster presents a successful implementation of the SPURport - a prototype Grid Portal for the earthquake engineering community. Developed as a part of the SPUR project, it extends functionality of the NEESgrid, which in turn, is an application of OGSI/Globus 3.0. We found that the implementation of a Grid portal is much easier when one introduces high-level middle-tier services that aggregate and coordinate lower-level services provided by the Globus toolkit. For example, our high level job submission service orchestrates resolution of logical entities to physical ones, file transfers, and data streaming prior to the actual resource allocation. We found it very useful to employ application descriptors that facilitate automatic generation of RSL documents.","PeriodicalId":446429,"journal":{"name":"Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128789056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}