{"title":"Telescoping languages: a compiler strategy for implementation of high-level domain-specific programming systems","authors":"K. Kennedy","doi":"10.1109/IPDPS.2000.845999","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.845999","url":null,"abstract":"As both machines and programs have become more complex, the programming process has become correspondingly more labor-intensive. This has created a software gap between the need for new software and the aggregate capacity of the current workforce to produce it. This problem has been compounded by the slow growth of programming productivity over the past two decades. One way to bridge this gap is to make it possible for end users to develop programs in high-level domain-specific programming systems. The principal impediment to the success of these systems in the past has been the poor performance of the resulting applications. To address this problem, we have developed a new compiler technology that supports script-based telescoping languages, which can be built from base languages and domain-specific libraries. By exhaustively compiling the libraries in advance, we can ensure that the performance and portability of the applications produced by such systems are high, while the compile times for scripts are acceptable to the end user. These qualities are essential if script-based systems are to be practical for development of production applications.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133112536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and evaluation of I/O strategies for parallel pipelined STAP applications","authors":"W. Liao, A. Choudhary, D. Weiner, P. Varshney","doi":"10.1109/IPDPS.2000.846050","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846050","url":null,"abstract":"This paper presents experimental results for a parallel pipeline STAP system with I/O task implementation using the parallel file systems on the Intel Paragon and the IBM SP. In our previous work, a parallel pipeline model was designed for radar signal processing applications on parallel computers. Based on this model, we implemented a real STAP application which demonstrated the performance scalability of this model in terms of throughput and latency. In this paper we study the effect on system performance when the I/O task is incorporated in the parallel pipeline model. There are two alternatives for I/O implementation: embedding I/O in the pipeline or having a separate I/O task. From these two I/O implementations, we discovered that the latency may be improved when the structure of the pipeline is reorganized by merging multiple tasks into a single task. All the performance results shown in this paper demonstrated the scalability of parallel I/O implementation on the parallel pipeline STAP system.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115838649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling with advanced reservations","authors":"Warren Smith, Ian T Foster, V. Taylor","doi":"10.1109/IPDPS.2000.845974","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.845974","url":null,"abstract":"Some computational grid applications have very large resource requirements and need simultaneous access to resources from more than one parallel computer. Current scheduling systems do not provide mechanisms to gain such simultaneous access without the help of human administrators of the computer systems. In this work, we propose and evaluate several algorithms for supporting advanced reservation of resources in supercomputing scheduling systems. These advanced reservations allow users to request resources from scheduling systems at specific times. We find that the wait times of applications submitted to the queue increase when reservations are supported, and that the increase depends on how reservations are supported. Further, we find that the best performance is achieved when we assume that applications can be terminated and restarted, backfilling is performed, and relatively accurate run-time predictions are used.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115925797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PaDDMAS: parallel and distributed data mining application suite","authors":"O. Rana, D. Walker, Maozhen Li, S. Lynden, M. Ward","doi":"10.1109/IPDPS.2000.846010","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846010","url":null,"abstract":"Discovering complex associations, anomalies and patterns in distributed data sets is gaining popularity in a range of scientific, medical and business applications. Various algorithms are employed to perform data analysis within a domain, and range from statistical to machine learning and AI based techniques. Several issues need to be addressed, however, to scale such approaches to large data sets, particularly when these are applied to data distributed at various sites. As new analysis techniques are identified, the core tool set must enable easy integration of such analytical components. Similarly, results from an analysis engine must be sharable, to enable storage, visualisation or further analysis of results. We describe the architecture of PaDDMAS, a component based system for developing distributed data mining applications. PaDDMAS provides a tool set for combining pre-developed or custom components using a dataflow approach, with components performing analysis, data extraction or data management and translation. Each component is wrapped as a Java/CORBA object, and has an interface defined in XML. Components can be serial or parallel objects, and may be binary or contain a more complex internal structure. We demonstrate a prototype using a neural network analysis algorithm.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124614964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gray codes for torus and edge disjoint Hamiltonian cycles","authors":"M. M. Bae, B. Bose","doi":"10.1109/IPDPS.2000.846007","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846007","url":null,"abstract":"Lee distance Gray codes for k-ary n-cubes and torus networks are presented. Using these Lee distance Gray codes, it is further shown how to directly generate edge disjoint Hamiltonian cycles for a class of k-ary n-cubes, 2-D tori, and hypercubes.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124778163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtual BUS: a network technology for setting up distributed resources in your own computer","authors":"T. Miyazaki, A. Takahara, S. Ishihara, S. Tani, T. Murooka, T. Fukazawa, M. Teramoto, K. Matsuhiro","doi":"10.1109/IPDPS.2000.846032","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846032","url":null,"abstract":"A novel distributed-resource abstraction environment is introduced. You can access any resource in a computer network as a memory mapped I/O device, as if it was attached to the local bus of your PC. This network technology gives us several benefits. From the application development viewpoint, no network-related programming is required, and we don't need to modify the applications even if the network topologies and protocols are changed. On the other hand, network maintenance and upgrading can be done anytime without worrying about the application users, because the environment completely separates or hides the network from the applications. The API (Application Program Interface), a resource abstraction mechanism, and a directory service are implemented. In addition, a reconfigurable hardware technology is adopted to perform autonomous network control using a lower layer protocol. Furthermore, we introduce a testbed that allows heterogeneous resources to be utilized, and demonstrate the feasibility of our concept using some applications.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124921972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fair and efficient packet scheduling in wormhole networks","authors":"S. Kanhere, Alpa B. Parekh, H. Sethu","doi":"10.1109/IPDPS.2000.846044","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846044","url":null,"abstract":"Most switch architectures for parallel systems are designed to eliminate only the worst kinds of unfairness such as starvation scenarios in which packets belonging to one traffic flow may not make forward progress for an indefinite period of time. However, stricter fairness can lead to a more predictable and better performance, in addition to improving isolation between traffic belonging to different users. This paper presents a new easily implementable scheduling discipline, called Elastic Round Robin (ERR), for the unique requirements of wormhole switching, popular in interconnection networks for parallel systems. Despite the constraints of wormhole switching imposed on the design, our scheduling discipline is at least as efficient as other scheduling disciplines, and more fair than scheduling disciplines of comparable efficiency proposed for any other kind of network, including the Internet. We prove that the work complexity of ERR is O(1) with respect to the number of flows. We analytically prove the fairness properties of ERR, and show that its relative fairness measure has an upper bound of 3m, where m is the size of the largest packet that actually arrives during an execution of ERR. Finally, we present simulation results comparing the fairness and performance characteristics of ERR with other scheduling disciplines of comparable efficiency.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122891057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the switch design space in a CC-NUMA multiprocessor environment","authors":"Marius Pirvu, N. Ni, L. Bhuyan","doi":"10.1109/IPDPS.2000.846055","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846055","url":null,"abstract":"The switch design for interconnection networks plays an important role in the overall performance of multiprocessors and computer networks. It is therefore crucial to study various factors in the switch design space and their influence on the system performance. In this paper we first propose a 4-D framework for the design of input queuing switches with wormhole routing and virtual channels. Then we explore the design space to examine in detail the impact of four parameters: virtual channel allocation, intraswitch connectivity, buffer space allocation, and link arbitration policy. Our simulations, performed with an execution driven simulator with ILP processors, show that the cumulative effect of the four switch enhancements ranges between 7% and 38%. The most important parameter proves to be the VC allocation method (up to 28% improvement in execution time). The other three bring about the same level of performance: between 1% and 7% depending on the application.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125138798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wavelengths requirement for permutation routing in all-optical multistage interconnection networks","authors":"Q. Gu, S. Peng","doi":"10.1109/IPDPS.2000.846062","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.846062","url":null,"abstract":"Previous studies showed that the cross-talk problem on the all-optical networks exists at both links and switches of the networks. To solve the cross-talk problem at both links and switches, one approach is to assign the wavelengths to the communication paths so that the paths which receive the same wavelength are node-disjoint. Our goal is to minimize the number of wavelengths required for permutation routings by node-disjoint paths on all-optical MINs which consist of n stages of 2×2 switches connecting N=2^n inputs and outputs. We prove that the problem of finding the minimum number of wavelengths for arbitrary partial permutation routings on the MINs is NP-complete. We show that any partial permutation routing can be realized by 2^[n/2] wavelengths and there exist permutation routings that require at least 2^[n/2] wavelengths. Although the general problem is NP-complete, we give an efficient algorithm for computing the minimum number of wavelengths for the class of BPC (bit permute-complement) permutations.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116812386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiclock Esterel: a reactive framework for asynchronous design","authors":"B. Rajan, R. Shyamasundar","doi":"10.1109/IPDPS.2000.845982","DOIUrl":"https://doi.org/10.1109/IPDPS.2000.845982","url":null,"abstract":"In this paper, we discuss a new paradigm called Multiclock Esterel, based on the paradigm of the synchronous reactive language, Esterel, used for reactive systems and synchronous circuit design. We show that the Multiclock Esterel paradigm provides a general framework for the design of systems with multiple local clocks and the earlier paradigm of CRP (Communicating Reactive Processes) can be obtained as an instance of the newly proposed paradigm. Furthermore, it preserves the advantages of the classical Esterel paradigm and thus benefits from the advantages of verifiability of specifications/models. Multiclock Esterel provides a formal basis for designing asynchronous circuits and provides a succinct unification of synchrony and asynchrony.","PeriodicalId":206541,"journal":{"name":"Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117112937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}