{"title":"The maximal utilization of processor co-allocation in multicluster systems","authors":"A. Bucur, D. Epema","doi":"10.1109/IPDPS.2003.1213154","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213154","url":null,"abstract":"In systems consisting of multiple clusters of processors which employ space sharing for scheduling jobs, such as our distributed ASCI supercomputer (DAS), co-allocation, i.e., the simultaneous allocation of processors to single jobs in multiple clusters, may be required. In studies of scheduling in single clusters it has been shown that the achievable (maximal) utilization may be much less than 100%, a problem that may be aggravated in multicluster systems. In this paper we study the maximal utilization when co-allocating jobs in multicluster systems, both with analytic means (we derive exact and approximate formulas when the service-time distribution is exponential) and with simulations using synthetic workloads and workloads derived from the logs of actual systems.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134313886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gigapixel-size real-time interactive image processing with parallel computers","authors":"Donald R. Jones, E. Jurrus, B. Moon, Kenneth A. Perrine","doi":"10.1109/IPDPS.2003.1213426","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213426","url":null,"abstract":"The parallel computational environment for imaging science, PiCEIS, is an image processing package designed for efficient execution on massively parallel computers. Through effective use of the aggregate resources of such computers, PiCEIS enables much larger and more accurate production processing using existing off-the-shelf hardware. The goals of PiCEIS are to decrease the difficulty of writing scalable parallel programs, reduce the time to add new functionality, and provide for real-time interactive image processing. In part this is accomplished by the PiCEIS architecture, its ability to easily add additional modules, and the use of a shared-memory programming model based upon one-sided access to distributed shared memory. In this paper, we briefly describe the PiCEIS architecture and our shared-memory programming tools, and examine some typical techniques and algorithms. Initial image processing performance testing is encouraging: for very large image files, processing time is less than 10 seconds.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131890512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Better real-time response for time-share scheduling","authors":"Scott A. Banachowski, S. Brandt","doi":"10.1109/IPDPS.2003.1213246","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213246","url":null,"abstract":"As computing systems of all types grow in power and complexity, it is common to want to simultaneously execute processes with different timeliness constraints. Many systems use CPU schedulers derived from time-share algorithms; because they are based on best-effort policies, these general-purpose systems provide little support for real-time constraints. This paper describes BeRate, a scheduler that integrates best-effort and soft real-time processing using a best-effort programming model in which soft real-time application parameters are inferred from runtime behavior. We show that with no a priori information about applications, BeRate outperforms Linux when scheduling workloads containing soft real-time applications.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132244060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparison between MPI and OpenMP branch-and-bound skeletons","authors":"I. Dorta, C. León, C. Rodríguez","doi":"10.1109/IPDPS.2003.1213254","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213254","url":null,"abstract":"This article describes and compares two parallel implementations of branch-and-bound skeletons. Using the C++ programming language, the user has to specify the type of the problem, the type of the solution, and the specific characteristics of the branch-and-bound technique. This information is combined with the provided resolution skeletons to obtain a distributed and a shared-memory parallel program. MPI has been used to develop the message-passing algorithm, and OpenMP has been chosen for the shared-memory one. Computational results for the 0/1 knapsack problem on a Sunfire 6800 SMP, an Origin 3000, and a PC cluster are presented.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133778122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of distributed search in open agent systems","authors":"V. Dimakopoulos, E. Pitoura","doi":"10.1109/IPDPS.2003.1213097","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213097","url":null,"abstract":"In open multi-agent systems agents need resources provided by other agents, but they are not aware of which agents provide the particular resources. Most solutions to this problem are based on a central directory that maintains a mapping between agents and resources. However, such solutions do not scale well, since the central directory becomes a bottleneck in terms of both performance and reliability. In this paper, we introduce a different approach: each agent maintains a limited-size local cache in which it keeps information about k different resources, that is, for each of k resources, it stores the contact information of one agent that provides it. This creates a directed network of caches. We address the following fundamental problem: how can an agent that needs a particular resource find an agent that provides it by navigating through this network of caches? We propose and analytically compare the performance of three different algorithms for this problem, flooding, teeming and random paths, in terms of three performance measures: the probability of locating the resource, the number of steps, and the number of messages to do so. Our analysis is also applicable to distributed search in unstructured peer-to-peer networks.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133473487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced processor budget for QoS management in multimedia systems","authors":"Chang-Gun Lee, L. Sha","doi":"10.1109/IPDPS.2003.1213244","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213244","url":null,"abstract":"Resource reservation and QoS negotiation are a common way to guarantee timely progress of programs in distributed multimedia systems. For this, determining the available resource capacity, the resource budget, is important. The resource budget depends on resource characteristics (e.g., processor, memory, disk, and network bandwidth) and scheduling algorithms. The paper provides an improved processor budget for the fixed-priority scheduling algorithm, which is the most common in commercial real-time operating systems. The improvement is possible by noting that, in multimedia systems, there is a prefixed set of task periods for the finite set of QoS options and parameters. Our approach explicitly takes these periods into account and calculates a tight bound on the processor budget using the linear programming technique. This bound significantly improves the Liu and Layland bound (Liu and Layland, 1973) and is also proved to be better than any other bound in the literature. We also show how this bound is effectively used for resource reservation and QoS re-negotiation for adapting to dynamic workloads.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"137 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133651901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An accurate and efficient parallel genetic algorithm to schedule tasks on a cluster","authors":"Michelle D. Moore","doi":"10.1109/IPDPS.2003.1213276","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213276","url":null,"abstract":"Recent breakthroughs in the mathematical estimation of parallel genetic algorithm parameters by Cantu-Paz (2000) are applied to the NP-complete problem of scheduling multiple tasks on a cluster of computers connected by a shared bus. Experiments reveal that the parallel scheduling algorithm develops very accurate schedules when the parameter guidelines are used.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117103996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of the parallel port to measure MPI intertask communication costs in COTS PC clusters","authors":"Maya Haridasan, G. H. Pfitscher","doi":"10.1109/IPDPS.2003.1213497","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213497","url":null,"abstract":"Performance analysis of system time parameters is important for the development of parallel and distributed programs because it provides a means of estimating program execution times and it is important for scheduling tasks on processors. Measuring time intervals between events occurring in different nodes of COTS clusters of workstations is not a trivial task due to the absence of a unified clock view. We propose a different approach to measure system time parameters and program performance in clusters with the aid of the parallel port present in every machine of a COTS cluster. Some experimental values of communication delays using the MPI library in a Linux PC cluster are presented and the efficiency and precision of the proposed mechanism are analyzed.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117108509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Programming metasystems with active objects","authors":"M. D. Santo, Franco Frattolillo, N. Ranaldo, W. Russo, E. Zimeo","doi":"10.1109/IPDPS.2003.1213257","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213257","url":null,"abstract":"The widespread diffusion of metasystems and grid environments makes it necessary to employ programming models able to well exploit a high, variable number of distributed heterogeneous resources. Many software frameworks designed for Grid computing do not address this problem. They only allow the use of existing programming libraries based on explicit message-passing communication models, often not suitable to manage the variability of a Grid. In this paper we present the customization of a component-based middleware for metacomputing, HiMM (Hierarchical Metacomputer Middleware), in order to support distributed programming based on the Active Object model provided by ProActive. This way a metasystem can be efficiently and transparently programmed by unifying the asynchronous remote method invocation model and the reflection provided by meta-objects.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"154 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115152987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulation of dynamic data replication strategies in Data Grids","authors":"H. Lamehamedi, Zujun Shentu, B. Szymanski, E. Deelman","doi":"10.1109/IPDPS.2003.1213206","DOIUrl":"https://doi.org/10.1109/IPDPS.2003.1213206","url":null,"abstract":"Data Grids provide geographically distributed resources for large-scale data-intensive applications that generate large data sets. However, ensuring efficient access to such huge and widely distributed data is hindered by the high latencies of the Internet. We address these challenges by employing intelligent replication and caching of objects at strategic locations. In our approach, replication decisions are based on a cost-estimation model and driven by the estimation of the data access gains and the replica's creation and maintenance costs. These costs are in turn based on factors such as runtime accumulated read/write statistics, network latency, bandwidth, and replica size. To support large numbers of users who continuously change their data and processing needs, we introduce scalable replica distribution topologies that adapt replica placement to meet these needs. In this paper we present the design of our dynamic memory middleware and replication algorithm. To evaluate the performance of our approach, we developed a Data Grid simulator, called GridNet. Simulation results demonstrate that replication improves the data access time in Data Grids, and that the gain increases with the size of the datasets involved.","PeriodicalId":177848,"journal":{"name":"Proceedings International Parallel and Distributed Processing Symposium","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116949461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}