{"title":"A Class of Static and Dynamic Hierarchical Interconnection Networks","authors":"P. T. Breznay, Mario A. López","doi":"10.1109/ICPP.1994.15","DOIUrl":"https://doi.org/10.1109/ICPP.1994.15","url":null,"abstract":"A TCN is a hierarchical interconnection network where isomorphic clusters are connected using a complete graph at the highest level of hierarchy. We extend the concept of TCN to the dynamic domain and reduce the hardware complexity in the static domain. The resulting networks outperform both their non-hierarchical and hierarchical counterparts, while improving on the congestion and fault-tolerance characteristics of the latter; they also have optimal connectivity, high bisection width, low degree, cost and diameter, and low average distance under both uniform and non-uniform traffic. For instance, with dynamic clusters we obtain the same delay as the corresponding non-hierarchical multistage network at 1/2 the cost, while the maximum and average delays are typically 2/3 and 3/4 of those in a comparable HMIN, at approximately the same cost. In the static case, using hypercube clusters, the degree, diameter and cost are approximately 1/2, 3/4 and 3/8 of the same parameters in a comparable-size hypercube, while the diameter and average distance are typically 1/2 and 3/4 of those in traditional HINs. In the case of mesh clusters, the diameter and cost are reduced even further, by an amount proportional to their square root.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125923669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SNAIL: A Multiprocessor Based on the Simple Serial Synchronized Multistage Interconnection Network Architecture","authors":"Masashi Sasahara, Junya Terada, Luo-Qun Zhou, Kalidou Gaye, J. Yamato, S. Ogura, H. Amano","doi":"10.1109/ICPP.1994.182","DOIUrl":"https://doi.org/10.1109/ICPP.1994.182","url":null,"abstract":"Simple Serial Synchronized (SSS) Multistage Interconnection Network (MIN) is a novel MIN architecture for connecting processors and memory modules in multiprocessors. Synchronized bit-serial communication simplifies the structure and control, and also solves the pin-limitation problem. Here, the design, implementation, and evaluation of a multiprocessor prototype called SNAIL with the SSS-MIN are presented. The heart of SNAIL is the prototype 1 µm CMOS SSS-MIN gate array chip, which exchanges packets from 16 inputs with a 50 MHz clock. Message combining is implemented with only a 20% increase in hardware. From the empirical evaluation with some application programs, it appears that the latency and synchronization overhead of the SSS-MIN are tolerable, and the bandwidth of the SSS-MIN is sufficient. Although the performance improvement with bit-serial message combining is small (1%) when instructions are stored in the local memory, it reaches up to 400% when instructions are stored in the shared memory.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"177 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116577788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compiler Optimization Technique for Data Cache Prefetching Using a Small CAM Array","authors":"Chi-Hung Chi","doi":"10.1109/ICPP.1994.71","DOIUrl":"https://doi.org/10.1109/ICPP.1994.71","url":null,"abstract":"With advances in compiler optimization and program flow analysis, software-assisted cache prefetching schemes using PREFETCH instructions are now possible. Although data can be prefetched accurately into the cache, the runtime overhead associated with these schemes often limits their practical use. In this paper, we propose a new scheme, called Stride_CAM Data Prefetching (SCP), to prefetch array references with constant strides accurately. Compared to current software-assisted data prefetching schemes, the SCP scheme has much lower runtime overhead without sacrificing prefetching accuracy. Our results show that the SCP scheme is particularly suitable for compute-intensive scientific applications, where cache misses are mainly due to array references with constant strides that the SCP scheme can prefetch very accurately.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114675073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Tag Scheme and Its Tree Representation for a Shuffle-Exchange Network","authors":"","doi":"10.1109/ICPP.1994.38","DOIUrl":"https://doi.org/10.1109/ICPP.1994.38","url":null,"abstract":"This paper introduces a new flexible, bidirectional tag scheme for an N × N k-stage shuffle-exchange network where k ≥ log_2 N. Based on the interstage correlations for a path, the EB-tag generating rules are presented. In addition, an elegant and simple graphical representation called an N-leaf Dual Complete Binary tree (N-leaf DCB-tree) for a given permutation is proposed and its characteristics are discussed. Then the universal necessary condition for a conflict-free connection pattern is given. Using this condition, the conflict-free routing problem can be transformed into the perfect assignment problem.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"193 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132156511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A General Inside-Out Routing Algorithm for a Class of Rearrangeable Networks","authors":"S. Seo, T. Feng","doi":"10.1109/ICPP.1994.29","DOIUrl":"https://doi.org/10.1109/ICPP.1994.29","url":null,"abstract":"In this paper, we present a generalized version of the routing algorithm [1] for a class of 2 log_2 N-stage networks which are made by concatenating two log_2 N-stage blocking networks. We show that the generalized algorithm can also cover a class of (2 log_2 N - 1)-stage networks. It is shown that the inside-out algorithm is a more general algorithm which covers a large class of inherently symmetric rearrangeable networks, including the Benes network and its equivalents. Moreover, it is shown that the time complexity of the algorithm is O(N), which is superior to that of the looping algorithm. The algorithm is discussed using a graph representation of the network, and its connectivity properties are shown by a graph describing rule. To show that the algorithm covers a class of 2 log_2 N-stage networks, we introduce the concept of a base-network. These base-networks satisfy some common connectivity properties, and we show that any concatenation of two base-networks can be routed by our new algorithm.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133458541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Mesh and Hierarchical Networks for Multiprocessors","authors":"V. Hamacher, Hong Jiang","doi":"10.1109/ICPP.1994.69","DOIUrl":"https://doi.org/10.1109/ICPP.1994.69","url":null,"abstract":"Upper bounds on message delay and throughput are developed for two networks that have been used in recent multiprocessor systems. Two-dimensional mesh networks with bidirectional links and no end-around connections are compared to bus-type hierarchical networks that use segmented rings for the interconnection paths at each level of the hierarchy. Wormhole routing of short, fixed-length messages is used in the mesh networks, while a complete message can be switched between ring segments in one switch time in the hierarchical networks. It is found that three-level hierarchical systems perform somewhat better than mesh systems with respect to the basic bounds criteria that are developed.","PeriodicalId":217179,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 1","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114304806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}