Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238): Latest Articles, Page 6

Dynamic load balancing schemes for computing accessible surface area of protein molecules

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738005

E. Suh, B. Narahari, R. Simha

Citations: 7

Precise control of instruction caches

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737965

Maria Smirli, D. Lioupis, K. Kissell

Citations: 0

One to all broadcast in hyper butterfly networks

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737984

Wei Shi, P. Srimani

Citations: 2

Performance analysis of wavelength converters in WDM wavelength routed optical networks

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737994

K. Venugopal, E. E. Rajan, P. S. Kumar

Citations: 24

On topology and bisection bandwidth of hierarchical-ring networks for shared-memory multiprocessors

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737997

G. Ravindran, M. Stumm

Citations: 18

Virtual channel multiplexing in networks of workstations with irregular topology

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737983

F. Silla, J. Duato, A. Sivasubramaniam, C. Das

Abstract: Networks of workstations are becoming a cost-effective alternative for small-scale parallel computing. Although they may not provide the closely coupled environment of multicomputers and multiprocessors, they meet the needs of a great variety of parallel computing problems at a lower cost. However, to achieve high efficiency, the interconnects used to build the network of workstations must provide very high bandwidth and low latency, making their design a critical issue. Recently, the authors proposed a very efficient flow control protocol for networks of workstations. This protocol multiplexes physical channels between several virtual channels and minimizes the use of control flits by transmitting several data flits each time a virtual channel gets the link. In this protocol, a virtual channel sends data flits until the message blocks or is completely transmitted. However, it can reduce network throughput by increasing short-message latency, because long messages monopolize channels and hinder the progress of short messages. In this paper, we analyze the impact of limiting the number of flits (block size) that a virtual channel can send once it gets the link. We propose a new version of the previous flow control protocol that is easily implementable in hardware. Simulation results show that limiting the maximum block size is not a good design decision, because the overall network performance decreases. Only when short-message latency is crucial is it acceptable to limit the block size.

Citations: 17
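
The trade-off described in this abstract can be illustrated with a toy model. The sketch below is not the authors' protocol or simulator: it assumes a single link shared round-robin by virtual channels, one control flit per link grant, and no blocking, and the names `multiplex`, `block_size`, and `control_cost` are hypothetical. It only shows why capping the block size helps short messages while lengthening the total transfer time.

```python
from collections import deque


def multiplex(message_lengths, block_size=None, control_cost=1):
    """Return the cycle at which each virtual channel finishes its message.

    Toy model (an assumption, not the paper's simulator): one link,
    round-robin grants, `control_cost` cycles of control flits per grant,
    one cycle per data flit, and messages never block.
    """
    queue = deque(
        (vc, flits) for vc, flits in enumerate(message_lengths) if flits > 0
    )
    finish = [0] * len(message_lengths)
    clock = 0
    while queue:
        vc, remaining = queue.popleft()
        # Unlimited block size: the channel keeps the link for the whole message.
        burst = remaining if block_size is None else min(remaining, block_size)
        clock += control_cost + burst
        remaining -= burst
        if remaining:
            queue.append((vc, remaining))  # go to the back of the round-robin queue
        else:
            finish[vc] = clock
    return finish


# A 100-flit message on channel 0 and a 4-flit message on channel 1.
print(multiplex([100, 4]))                # [101, 106]
print(multiplex([100, 4], block_size=8))  # [118, 14]
```

In this toy setting, an 8-flit cap cuts the short message's completion time from 106 to 14 cycles, but the extra control flits push the long message from 101 to 118 cycles, which is in line with the abstract's qualitative conclusion.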

Data structure distribution and multi-threading of Linux file system for multiprocessors

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737976

Anish Sheth, K. Gopinath

Citations: 2

Extended collective I/O for efficient retrieval of large objects

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.738009

S. More, A. Choudhary

Citations: 1

Permutation admissibility in shuffle-exchange networks with arbitrary number of stages

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737998

Nabanita Das, B. Bhattacharya, R. Menon, S. Bezrukov

Abstract: The set of input-output permutations that are routable through a multistage interconnection network without any conflict (known as the admissible set) plays an important role in determining the capability of the network. Recent work on the permutation admissibility problem of shuffle-exchange networks (SEN) of size N × N deals with (n + k) stages, where n = log_2 N and k denotes the number of extra stages. For k = 0 or 1, O(Nn) algorithms exist to check whether any permutation is admissible, but for k ≥ 2, a polynomial-time solution is not yet known. The more general problem of finding the minimum number (m) of shuffle-exchange stages required to realize an arbitrary permutation, 1 ≤ m ≤ 2n - 1, is also an open problem. In this paper, we present an O(Nn) algorithm that checks whether a given permutation P is admissible in an m-stage SEN, 1 ≤ m ≤ n, and determines in O(Nn log n) time the minimum number of shuffle-exchange stages m required to realize P. Thus, a single-stage shuffle-exchange network will be able to realize such a permutation with m passes, by recirculating all the paths m times through a single stage, i.e., with minimum transmission delay, which otherwise cannot be achieved with a fixed-stage SEN. Furthermore, we present a necessary condition for permutation admissibility in an m-stage SEN, where n

Citations: 4
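
For the baseline case the abstract mentions (k = 0, i.e. exactly n shuffle-exchange stages), an O(Nn) admissibility check can be sketched as follows. This is the standard destination-tag conflict test, not the paper's algorithm for m stages; the function name `omega_admissible` is hypothetical. It assumes each stage applies a perfect shuffle followed by 2×2 switches that set the low-order bit of the position to the next destination bit, most significant bit first.

```python
def omega_admissible(perm):
    """Check whether perm (source i -> destination perm[i]) passes an
    n-stage shuffle-exchange network without two paths sharing a link.

    Sketch for the k = 0 case only, using destination-tag routing.
    """
    size = len(perm)
    n = size.bit_length() - 1
    if size < 2 or (1 << n) != size or sorted(perm) != list(range(size)):
        raise ValueError("perm must be a permutation of 0..N-1, N a power of two")

    position = list(range(size))  # current link occupied by each source's path
    for stage in range(n):
        occupied = set()
        for src in range(size):
            # Destination bit consumed at this stage (MSB first).
            tag_bit = (perm[src] >> (n - 1 - stage)) & 1
            # Shifting left and masking to n bits frees the low-order bit,
            # which the switch then sets to the destination bit.
            link = ((position[src] << 1) & (size - 1)) | tag_bit
            if link in occupied:
                return False  # two paths need the same switch output
            occupied.add(link)
            position[src] = link
    return True


print(omega_admissible([1, 2, 3, 0]))  # cyclic shift on N = 4: True
print(omega_admissible([0, 2, 1, 3]))  # bit reversal on N = 4: False
```

Each of the n stages visits all N paths once, giving the O(Nn) bound under the assumption of constant-time set operations.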

Data prefetching with co-operative caching

Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238) Pub Date : 1998-12-17 DOI: 10.1109/HIPC.1998.737967

Chi-Hung Chi, Siu-Chung Lau

Abstract: Recent research in data cache prefetching is found to be selective in nature: it achieves high prediction accuracy over a set of selected references, such as array accesses with constant strides. As a result, these schemes lose their effectiveness for applications where the memory latency is mainly due to data accesses in the set of non-selected references of a program. In fact, their performance might be worse than that of the traditional, less accurate prefetch-on-miss scheme. To overcome this situation, we propose three cooperative cache techniques to assist data prefetching. They are: [1] default prefetching to increase the overall prefetch coverage; [2] a block concept to perform variable-distance lookahead prefetching; and [3] a spatial data buffer with load balancing to reduce the interference between spatial data and temporal data. To illustrate the potential of these techniques, they were implemented on top of our previously proposed Instruction Opcode-Based Prefetching (IOBP) scheme (T.F. Chen, 1993). Trace-driven simulation on SPEC92 showed that an 8-Kbyte data cache with a 512-byte spatial buffer can achieve performance similar to that of a 32-Kbyte data cache through these techniques.

Citations: 11
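
The traditional prefetch-on-miss scheme that this abstract uses as a point of comparison can be sketched with a toy direct-mapped cache. This is only the baseline, not the IOBP scheme or any of the three proposed techniques; the function `count_misses` and its parameters are hypothetical, and the model assumes that a miss on block b also loads block b + 1 at no extra cost.

```python
def count_misses(addresses, num_lines, line_size, prefetch_on_miss=False):
    """Count demand misses in a toy direct-mapped cache.

    With prefetch_on_miss=True, every miss on block b also loads block
    b + 1 (a simplified model that ignores prefetch timing and bandwidth).
    """
    cache = {}  # line index -> block number currently stored there
    misses = 0
    for addr in addresses:
        block = addr // line_size
        if cache.get(block % num_lines) != block:
            misses += 1
            cache[block % num_lines] = block
            if prefetch_on_miss:
                next_block = block + 1
                cache[next_block % num_lines] = next_block
    return misses


# Sequential word accesses: prefetching the next line halves the misses.
sequential = list(range(0, 256, 4))
print(count_misses(sequential, num_lines=8, line_size=16))                         # 16
print(count_misses(sequential, num_lines=8, line_size=16, prefetch_on_miss=True))  # 8

# A stride of two lines: the prefetched line is never used.
strided = list(range(0, 256, 32))
print(count_misses(strided, num_lines=8, line_size=16))                            # 8
print(count_misses(strided, num_lines=8, line_size=16, prefetch_on_miss=True))     # 8
```

The strided case shows the kind of access pattern where next-line prefetching gains nothing, which is the gap that more selective, stride-aware schemes are meant to fill.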