{"title":"Integrated VLSI layout compaction and wire balancing on a shared memory multiprocessor: evaluation of a parallel algorithm","authors":"P. Chalasani, K. Thulasiraman, M. Corneau","doi":"10.1109/ISPAN.1994.367165","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367165","url":null,"abstract":"We first present a unified formulation of three problems in VLSI physical design: layout compaction, wire balancing, and integrated layout compaction and wire balancing. The aim of layout compaction is to achieve minimum chip width. Whereas wire balancing seeks to achieve minimum total wire length, integrated layout compaction and wire balancing seeks to minimize wire length while maintaining the chip width at the optimum value. Our formulation is in terms of the dual transshipment problem. We then review our recent work on a parallel algorithm for the dual transshipment problem. We show how this algorithm, called the Modified Network Dual Simplex (MNDS) method, provides a unified approach to solving the three problems mentioned above, and we present experimental results. Our implementations were carried out on the BBN Butterfly machine. We draw attention to certain rather unusual results and argue that if the MNDS method is used, then integrated layout compaction and wire balancing will achieve minimum chip width and a total wire length close to the optimum achieved by the wire balancing algorithm.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131752602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Software pipelining for Jetpipeline architecture","authors":"M. Katahira, Takehito Sasaki, Hong Shen, Hiroaki Kobayashi, Tadao Nakamura","doi":"10.1109/ISPAN.1994.367155","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367155","url":null,"abstract":"High performance processors based on pipeline processing play an important role in scientific computation. We proposed a hybrid pipeline architecture named Jetpipeline in our previous work. The concept of Jetpipeline comes from the integration of superscalar, VLIW and vector architectures. Jetpipeline has multiple instruction pipelines, which execute multiple instructions as superscalar architectures do. Instructions to be executed simultaneously are statically scheduled by a compiler, as in VLIW architectures. Therefore, parallelism derivation and instruction scheduling are very important for Jetpipeline. Software pipelining is one of the well-known techniques for achieving high throughput when processing loop programs. In this paper, we propose software pipelining for Jetpipeline. First, an overview of the Jetpipeline architecture is given. Then the banked register configuration of Jetpipeline, which reduces hardware complexity and supports software pipelining, is presented. Finally, the effectiveness of software pipelining for Jetpipeline is discussed on the basis of simulation.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129016335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quicksort and permutation routing on the hypercube and de Bruijn networks","authors":"David S. L. Wei","doi":"10.1109/ISPAN.1994.367189","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367189","url":null,"abstract":"We consider the problems of sorting and routing on some unbounded interconnection networks, namely the hypercube and the de Bruijn network. We first present two efficient implementations of quicksort on the hypercube. The first algorithm sorts N items on an N-node hypercube, one item per node, in O((log^2 N)/(log log N)) time with high probability, while the other one sorts N items on an (N/log N)-node hypercube, log N items per node, in O(log^2 N) time with high probability, which achieves optimal speedup in the sense of the PT product. Both algorithms beat the fastest previous quicksort, which runs in O(log^2 N) expected time on a butterfly of N nodes. We also present a deterministic (nonoblivious) permutation routing algorithm which runs in O(d·n^2) time on a d-ary de Bruijn network of N=d^n nodes. To the best of our knowledge, this routing algorithm is so far the fastest deterministic one for the de Bruijn network of arbitrary degree. The best previous one runs in O((log d)·d·n^2) time. All the algorithms presented are simple, and the constants hidden in the big-O notation are small.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134607107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Views of mixed-mode computing and network evaluation","authors":"H. Siegel, J. Antonio","doi":"10.1109/ISPAN.1994.367171","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367171","url":null,"abstract":"Trade-offs between the SIMD and MIMD models of architecture for parallelism are presented. Mixed-mode parallelism, where a machine can switch between the SIMD and MIMD modes of parallelism at instruction-level granularity with generally negligible overhead, is discussed. Advantages and disadvantages of mixed-mode parallelism and an example of a mixed-mode parallel algorithm are given. The relationship of mixed-mode processing to high-performance heterogeneous computing is overviewed. Difficulties involved with evaluating interconnection networks for parallel machines are then considered. There are a myriad of metrics that have been used in the literature. The problems involved with choosing the most appropriate metric or weighted set of metrics, and performing \"fair\" comparisons, are explored.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131513435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Increase analysis in the total execution time of a parallel program","authors":"Dingchao Li, Hiromitsu Takagi, N. Ishii","doi":"10.1109/ISPAN.1994.367175","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367175","url":null,"abstract":"A lower bound on the finishing time of optimal schedules is used as an absolute performance measure of static scheduling heuristics. This paper presents an efficient method of computing such a bound, based on estimating overlaps among the execution ranges of tasks in a given task graph and analyzing the delays of tasks on the critical paths of the graph. The bound computed by this method is shown to be of higher quality than those of other known methods. Future work and directions on this topic are also indicated.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"34 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132440494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Organization of a parallel virtual machine","authors":"V. Beletsky, T. Popova, A. Chemeris","doi":"10.1109/ISPAN.1994.367137","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367137","url":null,"abstract":"A virtual parallel machine is presented. The virtual machine includes programs for loop parallelization, dependence graph building, and job scheduling, as well as a compiler and simulation programs. The basic principles and ideas on which the programs are based are explained. The basic opportunities and advantages of the virtual machine are enumerated. The results of simulation on the virtual machine are presented.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124169557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A systolic array implementation of common factor algorithm to compute DFT","authors":"S. He, M. Torkelson","doi":"10.1109/ISPAN.1994.367177","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367177","url":null,"abstract":"An extension to the common factor algorithm (CFA) for computing the discrete Fourier transform (DFT), under the condition that the size of the transform is N=M^2, shows that the input and output data arrays of the transform may have identical index mapping. A simple planar 2-dimensional systolic array implementation of the CFA is presented. The systolic array consists of N homogeneous processing elements (PEs). A DFT of size N=M^2 can be computed in 2M+1 steps of pipelined operations, achieving the area-time complexity AT^2=O(N^2 log^3 N). Although asymptotically suboptimal, the proposed approach requires no complicated index mapping or data shuffling and compares favorably with other existing approaches in realistic VLSI implementations. This architecture also has very good expandability: a 2^t N-size DFT can be computed on 2^t nearest-neighbor connected N-size arrays with reloaded twiddle factors, which makes it more suitable for VLSI implementation of DFTs of various practical sizes.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127216239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}