{"title":"A scalable and efficient self-organizing failure detector for grid applications","authors":"Yuuki Horita, K. Taura, T. Chikayama","doi":"10.1109/GRID.2005.1542743","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542743","url":null,"abstract":"Failure detection and group membership management are basic building blocks for self-repairing systems in distributed environments, which need to be scalable, reliable, and efficient in practice. As available resources become larger in size and more widely distributed, it is more essential that they can be easily used with a small amount of manual configuration in grid environments, where connectivities between different networks may be limited by firewalls and NATs. In this paper, we present a scalable failure detection protocol that self-organizes in grid environments. Our failure detectors autonomously create dispersed monitoring relationships among participating processes with almost no manual configuration so that each process will be monitored by a small number of other processes, and quickly disseminate notifications along the monitoring relationships when failures are detected. With simulations and real experiments, we showed that our failure detector has a practical scalability, a high reliability, and a good efficiency. The overhead with 313 processes was at most 2-percent even when the heartbeat interval was set to 0.1 second, and accordingly smaller when it was longer.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129788183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid-level computing needs pervasive debugging","authors":"Rashid Mehmood, J. Crowcroft, S. Hand, Steven Smith","doi":"10.1109/GRID.2005.1542741","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542741","url":null,"abstract":"Developing applications for parallel and distributed systems is hard due to their nondeterministic nature; developing debugging tools for such systems and applications is even harder. A number of distributed debugging tools and techniques exist; however, we believe that they lack the infrastructure to scale to large-scale distributed systems, systems with hundreds and thousands of nodes, such as grids. In this paper, we introduce PDB, our prototype debugger, which is based on a hierarchical, scalable architecture. We explain the design of the PDB, highlight its functionality, and demonstrate its usability with two case studies. Before concluding, we discuss portability and extensibility issues for PDB, and discuss some solutions.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132815974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A self-organized grouping (SOG) method for efficient Grid resource discovery","authors":"Anand Padmanabhan, Shaowen Wang, Sukumar Ghosh, R. Briggs","doi":"10.1109/GRID.2005.1542762","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542762","url":null,"abstract":"This paper presents a self-organized grouping (SOG) method that achieves efficient Grid resource discovery by forming and maintaining autonomous resource groups. Each group dynamically aggregates a set of resources that are similar to each other in some pre-specified resource characteristic. The SOG method takes advantage of the strengths of both centralized and decentralized approaches that were previously developed for Grid/P2P resource discovery. The design of the SOG method minimizes the overhead incurred in forming and maintaining groups and maximizes resource discovery performance. The way SOG method handles resource discovery queries is metaphorically similar to searching for a word in an English dictionary by identifying its alphabetical groups at the first place. It is shown from a series of computational experiments that SOG method achieves more stable (i.e., independent of the factors such as resource densities, and Grid sizes) and efficient lookup performance than other existing approaches.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131777527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Policy-based access control in peer-to-peer grid systems","authors":"J. F. Silva, L. Gaspary, M. Barcellos, André Detsch","doi":"10.1109/GRID.2005.1542731","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542731","url":null,"abstract":"Access control to resources is one of the most important requirements to be satisfied in grid systems that span over multiple administrative domains. Such a mechanism allows every institution taking part of a grid community to define and enforce policies for the use of their local resources by remote users. Despite the efforts of the research community to address this topic, existing approaches do not scale (e.g., in terms of communication overhead) for a large number of nodes (peers) providing resources, as these approaches rely on centralized servers to process access requests. Furthermore, they provide limited, large-grain policy specification functionality and are not committed to employing open, standardized formats to express policies. In this paper, we address these limitations by proposing PeGAC (peer-to-peer grid access control), a policy-based, distributed access control mechanism, which can be applied to P2P grid systems. In our proposal, policies are specified using the role-based access control model and coded using the extensible access control markup language. As a proof-of-concept we have integrated PeGAC into OurGrid, a middleware for the implementation of P2P grid systems. Preliminary results of experiments carried out at the resulting infrastructure show that our solution poses small communication and processing overhead, and can handle large policy repositories efficiently.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132219085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Saleve: simple Web-services based environment for parameter study applications","authors":"Z. Molnár, I. Szeberényi","doi":"10.1109/GRID.2005.1542757","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542757","url":null,"abstract":"The goal of the Saleve Project is to develop and evaluate mechanisms and abstractions that may connect the diverse research community of the distributed (mainly the grid) computing to those users, who are not familiar with distributed computing as such, but who would simply like to use the results in their everyday tasks. We show a simple Web-services based, domain-specific computational framework that integrates smoothly into the well-known, traditional user environments and requires learning no new technologies.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"2000 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128273129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An end-to-end Web services-based infrastructure for biomedical applications","authors":"S. Krishnan, K. Baldridge, J. Greenberg, B. Stearn, K. Bhatia","doi":"10.1109/GRID.2005.1542727","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542727","url":null,"abstract":"Services-oriented architectures hold a lot of promise for grid-enabling scientific applications. In recent times, Web services have gained wide-spread acceptance in the grid community as the standard way of exposing application functionality to end-users. Web services-based architectures provide accessibility via a multitude of clients, and the ability to enable composition of data and applications in novel ways for facilitating innovation across scientific disciplines. However, issues of diverse data formats and styles which hinder interoperability and integration must be addressed. Providing Web service wrappers for legacy applications alleviates many problems because of the exchange of strongly typed data, defined and validated using XML schemas, that can be used by workflow tools for application integration. In this paper, we describe the end-to-end architecture of such a system for biomedical applications that are part of the National Biomedical Computation Resource (NBCR). We present the technical challenges in setting up such an infrastructure, and discuss in detail the back-end resource management, application services, user-interfaces, and the security infrastructure for the same. We also evaluate our prototype infrastructure, discuss some of its shortcomings, and the future work that may be required to address them.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132827406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Peer-to-peer discovery of computational resources for Grid applications","authors":"Adeep S. Cheema, M. Muhammad, Indranil Gupta","doi":"10.1109/GRID.2005.1542740","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542740","url":null,"abstract":"Grid applications need to discover computational resources quickly, efficiently and scalably, but most importantly in an expressive manner. An expressive query may specify a variety of required metrics for the job, e.g., the number of hosts required, the amount of free CPU required on these hosts, and the minimum amount of RAM required on these hosts, etc. We present a peer-to-peer (P2P) solution to this problem, using structured naming to enable both (1) publishing of information about available computational resources, as well as (2) expressive and efficient querying of such resources. Extensive traces collected from hosts within the Computer Science department at UIUC are used to evaluate our proposed solution. Finally, our solutions are based upon a well known P2P system called Pastry, albeit for Grid applications; this is another step towards the much-needed convergence of Grid and P2P computing.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130593301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Addressing credential revocation in grid environments","authors":"B. Sundaram, B. Chapman","doi":"10.1109/GRID.2005.1542764","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542764","url":null,"abstract":"Credential revocation is a critical problem in grid environments and remains unaddressed in existing grid security solutions. We present our ongoing work in designing a novel grid authentication system, based on Globus GSI, that solves the revocation problem. The focus of this work is to ensure instantaneous revocation of both long-term digital identities of hosts/users and short-lived identities of user proxies. Our system employs mediated RSA (mRSA), adapts Boneh's notion of semi-trusted mediators to suit security in virtual organizations and propagates proxy revocation information as in Micali's NOVO-MODO system. We envision that our system would additionally provide a configuration-free security model for end users and fine-grained management of user credentials.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125020675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient response time predictions by exploiting application and resource state similarities","authors":"Hui Li, D. Groep, L. Wolters","doi":"10.1109/GRID.2005.1542747","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542747","url":null,"abstract":"In large-scale grids with many possible resources (clusters of computing elements) to run applications, it is useful that the resources can provide predictions of job response times so users or resource brokers can make better scheduling decisions. Two metrics need to be estimated for response time predictions: one is how long a job executes on the resource (application run time), the other is how long the job waits in the queue before starting (queue wait time). In this paper we propose an instance based learning technique to predict these two metrics by mining historical workloads. The novelty of our approach is to introduce policy attributes in representing and comparing resource states, which is defined as the pool of running and queued jobs on the resource at the time to make a prediction. The policy attributes reflect the local resource scheduling policies and they can be automatically discovered using a genetic search algorithm. The main advantages of this approach compared with scheduler simulation are two-folds: Firstly, it has a better performance to meet the real time requirement of Grid resource brokering; secondly, it is more general because the scheduling policies are learned from past observations. Our experimental results on the NIKHEF LCG production cluster show that acceptable prediction accuracy can be obtained, where the relative prediction errors for response times are between 0.35 and 0.70.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130274132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Authorization of data access in distributed storage systems","authors":"D. Feichtinger, A. Peters","doi":"10.1109/GRID.2005.1542739","DOIUrl":"https://doi.org/10.1109/GRID.2005.1542739","url":null,"abstract":"This paper describes an efficient method for access authorization in distributed (grid) storage systems. Client applications obtain \"access tokens\" from an organization's file catalogue upon execution of a file name resolution request. Whenever a client application tries to access the requested files, the token is transparently passed to the target storage system. Thus the storage service can decide on the authorization of a request without itself having to contact the authorization service. The token is protected from access and modification by external parties using public key infrastructure. A prototype using the AliEn grid file catalogue and xrootd as a data server has been implemented. A detailed description of the prototype implementation is presented.","PeriodicalId":347929,"journal":{"name":"The 6th IEEE/ACM International Workshop on Grid Computing, 2005.","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127436641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}