{"title":"Integrating list heuristics into genetic algorithms for multiprocessor scheduling","authors":"Ricardo C. Corrêa, Afonso Ferreira, P. Rebreyend","doi":"10.1109/SPDP.1996.570369","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570369","url":null,"abstract":"In the multiprocessor scheduling problem a given program is to be scheduled in a multiprocessor system such that the program's execution time is minimized. This problem being very hard to solve exactly, many heuristic methods for finding a suboptimal schedule exist. The authors propose a new combined approach, where a genetic algorithm is improved with the introduction of some knowledge about the scheduling problem represented by the use of a list heuristic in the crossover and mutation genetic operations. This knowledge-augmented genetic approach is empirically compared with a \"pure\" genetic algorithm and with a \"pure\" list heuristic, both from the literature. Results of the experiments carried out with synthetic instances of the scheduling problem show that the genetic algorithm produces much better results in terms of quality of solutions, although being slower in terms of execution time.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123540066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time sonar beamforming on a MasPar architecture","authors":"J. Salinas, R. Bernecky","doi":"10.1109/SPDP.1996.570338","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570338","url":null,"abstract":"This paper presents a novel approach for performing real-time sonar beamforming on linear sensor arrays using the MasPar SIMD architecture. The beamforming problem is defined as a three dimensional solution space by generating a cube structure with sonar array elements as one dimension, the required beams in another dimension, and the time samples in the third dimension. The given approach maps the problem cube into the MasPar structure using a modified one-to-one mapping and uses two MasPar Fortran 90 intrinsic array functions to generate the solutions to the beams. Simulation results are provided for different array and beam sizes.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125909662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load-balancing in sparse matrix-vector multiplication","authors":"S. Nastea, O. Frieder, T. El-Ghazawi","doi":"10.1109/SPDP.1996.570337","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570337","url":null,"abstract":"We consider the load-balanced multiplication of a large sparse matrix with a large sequence of vectors, on parallel computers. Due to the associated computational and inter-node communication challenges, we propose a method that combines fast load-balanced work allocation with efficient message passing implementations. The performance of the proposed method was evaluated on benchmark matrices as well as on synthetically generated matrix data. We compare our proposed allocation solution with previous research work. It is shown that, by using our approach, a tangible improvement over prior work can be obtained, particularly for very sparse and skewed matrices.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125677886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementing cooperative software with high-level communication packages","authors":"A. Forst, E. Kühn","doi":"10.1109/SPDP.1996.570317","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570317","url":null,"abstract":"The use of appropriate tools is crucial for the development of robust and distributed software. The programming of heterogeneous environments is more demanding than programming single, stand-alone computers. We believe that client/server technology is not a satisfactory solution. Most problems do not naturally decompose into an asymmetric client/server structure. Better abstraction mechanisms are needed. We propose a new coordination framework that we have developed. It supports shared objects as reliable communication media, advanced transactions, and concurrency through processes that form reliable software contracts. For a discussion, we compare the realization of a typical distributed application, that belongs to the domain of cooperative work, with three different tools: our coordination framework; a representative of the classical client/server and message paradigm; and the Linda communication model.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129357440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The direct dimension exchange method for load balancing in k-ary n-cubes","authors":"Minyou Wu, W. Shu","doi":"10.1109/SPDP.1996.570356","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570356","url":null,"abstract":"The dimension exchange method (DEM) was initially proposed as a load-balancing algorithm for the hypercube structure. It has been generalized to k-ary n-cubes. However, the k-ary n-cube algorithm must take many iterations to converge to a balanced state. In this paper we propose a direct method to modify DEM. The new algorithm, the Direct Dimension Exchange (DDE) method, takes the load average in every dimension to eliminate unnecessary load exchange. It balances the load directly without iteratively exchanging the load. This global approach is able to balance the load more accurately and much faster.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133383062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new fixed degree regular network for parallel processing","authors":"S. Latifi, P. Srimani","doi":"10.1109/SPDP.1996.570328","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570328","url":null,"abstract":"We propose a family of regular Cayley network graphs of degree three based on permutation groups for the design of massively parallel systems. These graphs are shown to be based on the shuffle exchange operations, to have logarithmic diameter in the number of vertices, and to be maximally fault tolerant. We investigate different algebraic properties of these networks (including fault tolerance) and propose a simple routing algorithm. These graphs are shown to be able to efficiently simulate other permutation group based graphs; thus they seem to be very attractive for VLSI implementation and for applications requiring a bounded number of I/O ports as well as to run existing applications for other permutation group based architectures.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127061602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Almost two-state self-stabilizing algorithm for token rings","authors":"G. Alari, A. Datta","doi":"10.1109/SPDP.1996.570316","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570316","url":null,"abstract":"A self-stabilizing distributed system is a network of processors, which, regardless of its initial global state, will achieve the desired state in a finite number of steps. There are two main performance issues in the design of a self-stabilizing system: the stabilization time and memory requirements (the number of states required by each process). We first show that the probabilistic two-state algorithm for asynchronous, unidirectional token rings stabilizes only in systems where k, the upper bound for the ratio of the speeds of any two processes, exists, but is unknown, and neither the convergence time nor token circulation delay of this algorithm can be bounded. Then we present an almost two-state self-stabilizing algorithm for unidirectional token rings. The processes move synchronously and k is known. The algorithm requires each process in the ring to have two states; one process, called the exceptional process, needs an additional integer variable of size O(n), where n is the number of nodes in the ring; the algorithm stabilizes in O(n) time and achieves an O(kn) token circulation delay.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128137069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unifying methodology for multiple querying on enhanced meshes","authors":"V. Bokka, H. Gurla, S. Olariu, J. Schwing, L. Wilson","doi":"10.1109/SPDP.1996.570360","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570360","url":null,"abstract":"The main contribution of this work is to show that a number of seemingly unrelated problems in database design, pattern recognition, robotics, and image processing can be solved simply and elegantly by formulating them as instances of a general problem: the multiple query (MQ) problem. An arbitrary instance of the multiple query problem consists of a collection A={a_1, a_2, ..., a_n} of items, a collection Q={q_1, q_2, ..., q_m} (1≤m≤n) of queries, a decision problem φ:Q×A→{\"yes\", \"no\"}, and an associative and commutative function f operating on subsets of A. For every query q_i, let S_i be the set of items a_j in A for which φ(q_i, a_j)=\"yes\". The solution of q_i is defined to be f(S_i). In this context, the multiple query problem involves solving all the queries in Q. We begin by showing that if the collections A and Q are stored one item and at most one query per processor on a mesh with multiple broadcasting of size √n×√n then any algorithm that solves the MQ problem requires Ω(m^{1/3} n^{1/6}) time in the worst case. Second, we show that a number of fundamental problems can be solved simply and elegantly by formulating them as instances of the MQ problem.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129529513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The \"express channel\" concept in hypermeshes and k-ary n-cubes","authors":"S. Loucif, L. Mackenzie, M. Ould-Khaoua","doi":"10.1109/SPDP.1996.570385","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570385","url":null,"abstract":"Low-dimensional k-ary n-cubes have been popular in recent multicomputers. However these networks suffer from high switching delays due to their high message distance. To overcome this problem, Dally (1990) has proposed express k-ary n-cubes with express channels, that allow non-local messages to partially bypass clusters of nodes within a dimension. The paper argues that hypergraph topologies, that provide total bypasses within a dimension, represent potential candidates as future high-performance networks. It presents a comparative study, of a regular hypergraph, referred to as the distributed crossbar switch hypermesh (DCSH), and the express k-ary n-cube, taking into account channel bandwidth constraints which apply in VLSI and multiple-chip technology. The study concludes that the DCSH's total bypass strategy yields superior performance characteristics to the partial bypassing of its express cube counterpart.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126297925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sorting N items using a p-sorter in optimal time","authors":"S. Olariu, Si-Qing Zheng","doi":"10.1109/SPDP.1996.570343","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570343","url":null,"abstract":"A sorting device capable of sorting p items in constant time is called a p-sorter. It is known that the task of sorting N items using a p-sorter requires at least Ω((N log N)/(p log p)) applications of the p-sorter. This bound is tight: there exist algorithms that use O((N log N)/(p log p)) calls to the p-sorter to sort N items. However, there is no known implementable algorithm that can sort N items in O((N log N)/(p log p)) time using a p-sorter. The main contribution of this paper is to propose a simple VLSI architecture and to show that in our architecture N items can be sorted in O((N log N)/(p log p)) calls to the p-sorter, while enforcing conflict-free memory accesses. An important feature of our design is that the total additional VLSI area for hardware, other than the memory for data and the p-sorter, is kept to a minimum.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115835347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}