Proceedings 11th International Parallel Processing Symposium: Latest Articles

Maximum delivery time and hot spots in ServerNet™ topologies
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580993
D. Avresky, V. Shurbanov, R. Horst, W. Watson, L. Young, D. Jewett
Abstract: This paper analyzes the performance characteristics of ServerNet topologies, concentrating on predicting the maximum two-way delivery time through simulation and statistical analysis, and on identifying congested links (hot spots) and tree saturation. ServerNet™, developed by Tandem Computers Inc., is a wormhole-routed, packet-switched, point-to-point network, with special attention paid to reducing latency and to assuring reliability. ServerNet uses multiple high-speed, low-cost routers to rapidly switch data directly between data sources and destinations. Our study is based on data generated by a simulation tool. Statistical analysis and inference methods were used to process the samples generated by the simulator and to obtain estimates for the maximum two-way packet delivery time. Link usage statistics were recorded by the simulator for the purpose of performing a detailed investigation of congestion effects and hot spots. Hot spots may cause tree saturation in the network, in which an individual tree becomes congested while all other trees are mostly idle, leading to significant performance degradation.
Citations: 6
An accurate model for the performance analysis of deterministic wormhole routing
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580926
B. Ciciani, C. Paolucci, M. Colajanni
Abstract: This paper presents a new analytical approach for the performance evaluation of asynchronous wormhole routing in k-ary n-cubes. Through the analysis of network flows, our methodology furnishes a closed formula for the average message delay in wormhole deterministic routing. In this paper, the focus is on 3D asymmetric torus networks with uni-directional or bi-directional links. However, the model can be easily applied to evaluate the performance of deterministic wormhole policies in any hypercube and torus topology. The comparison with two simulation models demonstrates that our methodology gives accurate results for both low and high traffic.
Citations: 34
Cyclic networks: A family of versatile fixed-degree interconnection architectures
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580990
C. Yeh, B. Parhami
Abstract: In this paper, we propose a new family of interconnection networks, called cyclic networks (CNs), in which an intercluster connection is defined on a set of nodes whose addresses are cyclic shifts of one another. The node degrees of basic CNs are independent of system size, but can vary from a small constant (e.g., 3) to as large as required, thus providing flexibility and an effective tradeoff between cost and performance. The diameters of suitably constructed CNs can be asymptotically optimal within their lower bounds, given the degrees. We show that packet routing and ascend/descend algorithms can be performed in Θ(log_d N) communication steps on some CNs with N nodes of degree Θ(d). Moreover, CNs can also efficiently emulate homogeneous product networks (e.g., hypercubes and high-dimensional meshes). As a consequence, we obtain a variety of efficient algorithms on such networks, thus proving the versatility of CNs.
Citations: 11
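As a rough illustration of the phrase "nodes whose addresses are cyclic shifts of one another", the sketch below enumerates the distinct cyclic shifts of an address. The string address format and the function name `cyclic_shifts` are assumptions made for this example; the abstract does not specify how the paper encodes addresses or wires the resulting node sets together.

```python
from typing import List


def cyclic_shifts(address: str) -> List[str]:
    """Return the distinct cyclic (left) shifts of an address, in shift order.

    Hypothetical helper: the address is modelled as a string of symbols,
    which is an assumption and not taken from the paper.
    """
    shifts: List[str] = []
    for i in range(len(address)):
        shifted = address[i:] + address[:i]  # rotate left by i positions
        if shifted not in shifts:            # periodic addresses repeat early
            shifts.append(shifted)
    return shifts


if __name__ == "__main__":
    # The nodes in one such set would be candidates for an intercluster connection.
    print(cyclic_shifts("0112"))
    print(cyclic_shifts("0101"))
```

Note that a periodic address such as "0101" has fewer distinct shifts than symbols, so the size of such a node set can vary with the address.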
An efficient parallel algorithm for solving the knapsack problem on the hypercube
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580964
A. Goldman, D. Trystram
Abstract: The authors present a new algorithm to solve the integral knapsack problem on the hypercube. The main idea is to use the fact that the precedence graph of the dynamic programming function of the knapsack problem is an irregular mesh. They propose a scheduling algorithm for irregular meshes on the hypercube. The efficiency of the algorithm is independent of the number of processors. They also present some improvements for the solution of the 0/1 knapsack problem on the hypercube.
Citations: 5
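To make the "irregular mesh" remark concrete, here is a sequential sketch of the textbook dynamic program for the knapsack problem with unlimited copies of each item, assuming that is what "integral knapsack" refers to. It is not the authors' parallel algorithm; the function name and argument layout are chosen for this example only. Each cell `f[k][c]` depends on `f[k-1][c]` (one step along the item axis) and on `f[k][c - w]` (a jump of `w` along the capacity axis), which gives a mesh-like precedence graph whose jump length varies from row to row.

```python
from typing import List


def unbounded_knapsack(weights: List[int], values: List[int], capacity: int) -> int:
    """Best total value with unlimited copies of each item type.

    Assumes positive integer weights; a textbook recurrence, not the
    paper's hypercube algorithm.
    """
    m = len(weights)
    # f[k][c]: best value using only the first k item types within capacity c
    f = [[0] * (capacity + 1) for _ in range(m + 1)]
    for k in range(1, m + 1):
        w, v = weights[k - 1], values[k - 1]
        for c in range(capacity + 1):
            best = f[k - 1][c]                    # skip item type k entirely
            if w <= c:
                best = max(best, f[k][c - w] + v)  # take one more copy of item k
            f[k][c] = best
    return f[m][capacity]


if __name__ == "__main__":
    print(unbounded_knapsack([3, 4], [4, 5], 10))
```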
Relative performance of preemption-safe locking and non-blocking synchronization on multiprogrammed shared memory multiprocessors
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580906
Maged M. Michael, M. Scott
Abstract: Most multiprocessors are multiprogrammed to achieve acceptable response time. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) non-blocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Non-blocking algorithms generally require a universal atomic primitive, and are widely regarded as inefficient. We present a comparison of the two alternative strategies, focusing on four simple but important concurrent data structures (stacks, FIFO queues, priority queues and counters) in microbenchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that data-structure-specific non-blocking algorithms, which exist for stacks, FIFO queues and counters, can work extremely well: not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose non-blocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform conventional locks by significant margins on multiprogrammed systems.
Citations: 26
Scalability of SCI workstation clusters, a preliminary study
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580992
Knut Omang, B. Parady
Abstract: SCI is based on unidirectional point-to-point links connected in rings. Separate rings can be connected using switches to allow for larger configurations than are feasible with a single ring. The SCI standard is implemented in the LinkController (LC-1) chip from Dolphin Interconnect Solutions. Experiments are conducted on a set of Sun UltraSparc workstations connected by the new Sbus/SCI adapters based on the LC-1. The practical limit on the number of nodes on a ring depends on the host interface and the link capacity and tuning, as well as the expected interconnect load. Single-ring configurations of up to 10 nodes, and 4- and 8-node systems using multiple smaller rings connected through a 4-way switch, are analysed with respect to latency and network throughput. Results demonstrate a peak aggregate ring bandwidth of 187 Mbytes/s in a 10-node ring configuration, and a peak bandwidth of 153 Mbytes/s for the 4-way switched 8-node system. Results also show that the interconnect is sensitive to certain traffic patterns, and that different topologies have different weak spots in this sense.
Citations: 16
A comparison of general approaches to multiprocessor scheduling
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580873
Jing-Chiou Liou, M. Palis
Abstract: The paper demonstrates the effectiveness of the two-phase method of scheduling, in which task clustering is performed prior to the actual scheduling process. Task clustering determines the optimal or near-optimal number of processors on which to schedule the task graph. In other words, there is never a need to use more processors (even though they are available) than the number of clusters produced by the task clustering algorithm. The paper also indicates that when task clustering is performed prior to scheduling, load balancing (LB) is the preferred approach for cluster merging. LB is fast, easy to implement, and produces significantly better final schedules than communication traffic minimizing (CTM). In summary, the two-phase method consisting of task clustering and load balancing is a simple, yet highly effective strategy for scheduling task graphs on distributed memory parallel architectures.
Citations: 91
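The abstract does not spell out how load-balancing cluster merging works, so the following is only a sketch of one common greedy heuristic (largest cluster first, always onto the currently least-loaded processor). The function name `merge_clusters_lb` and the use of a single scalar load per cluster are assumptions for illustration and may differ from the LB procedure evaluated in the paper.

```python
import heapq
from typing import List


def merge_clusters_lb(cluster_loads: List[int], num_processors: int) -> List[List[int]]:
    """Greedily merge clusters onto processors to even out total load.

    Hypothetical sketch: returns, for each processor, the indices of the
    clusters assigned to it.
    """
    # Min-heap of (current load, processor id); ties go to the lower id.
    heap = [(0, p) for p in range(num_processors)]
    assignment: List[List[int]] = [[] for _ in range(num_processors)]
    # Place the heaviest clusters first so small ones can fill the gaps.
    order = sorted(range(len(cluster_loads)), key=lambda i: -cluster_loads[i])
    for cid in order:
        load, p = heapq.heappop(heap)  # least-loaded processor so far
        assignment[p].append(cid)
        heapq.heappush(heap, (load + cluster_loads[cid], p))
    return assignment


if __name__ == "__main__":
    print(merge_clusters_lb([7, 5, 4, 3, 1], 2))
```

This heuristic ignores communication between clusters entirely, which is the point of contrast with a traffic-minimizing merge.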
Fault-tolerant deadline-monotonic algorithm for scheduling hard-real-time tasks
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580868
A. Bertossi, Andrea Fusiello, L. Mancini
Abstract: The paper presents a fault-tolerant scheduling algorithm for multiprocessor hard-real-time systems. The so-called partitioning method is used to schedule a set of tasks in a multiprocessor system. Fault tolerance is achieved by using a combined duplication technique where each task scheduled on a processor has either an active or a passive copy scheduled on a different processor. Simulation experiments reveal a saving of processors with respect to those needed by the usual approach of duplicating the schedule of the non-fault-tolerant case.
Citations: 22
Coarse grained parallel next element search
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580919
Albert Chan, F. Dehne, A. Rau-Chaplin
Abstract: The authors present a parallel algorithm for solving the next element search problem on a set of line segments, using a BSP-like model referred to as the coarse grained multicomputer (CGM). The algorithm requires O(1) communication rounds (h-relations with h=O(n/p)), O((n/p) log n) local computation, and O((n/p) log n) storage per processor. The result implies solutions to the point location, trapezoidal decomposition and polygon triangulation problems. A simplified version for axis-parallel segments requires only O(n/p) storage per processor, and they discuss an implementation of this version. As in a previous paper by Devillers and Fabri (1993), their algorithm is based on a distributed implementation of segment trees, which are of size O(n log n). The paper improves on the work of Devillers and Fabri, which presented a CGM algorithm for the special case of trapezoidal decomposition only and requires O((n/p) log p log n) local computation.
Citations: 9
Accuracy and speed-up of parallel trace-driven architectural simulation
Proceedings 11th International Parallel Processing Symposium Pub Date: 1997-04-01 DOI: 10.1109/IPPS.1997.580842
A. Nguyen, P. Bose, K. Ekanadham, Ashwini K. Nanda, Maged M. Michael
Abstract: Trace-driven simulation continues to be one of the main evaluation methods in the design of high performance processor-memory sub-systems. In this paper, we examine the varying speed-up opportunities available by processing a given trace in parallel on an IBM SP-2 machine. We also develop a simple, yet effective method of correcting for cold-start cache miss errors, by the use of overlapped trace chunks. We then report selected experimental results to validate our expectations. We show that it is possible to achieve near-perfect speedup without loss of accuracy. Next, in order to achieve further reduction in simulation cost, we combine uniform sampling methods with parallel trace processing, with a slight loss of accuracy for finite-cache timer runs. We then show that by using warm-start sequences from preceding trace chunks, it is possible to reduce the errors back to acceptable bounds.
Citations: 41
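The idea of overlapped trace chunks can be sketched as follows, under heavy simplifying assumptions: a direct-mapped cache indexed by block address, a trace given as a list of integer block addresses, and a fixed overlap length. The names `count_misses` and `chunked_misses` are hypothetical, and the loop over chunks is written sequentially here even though each chunk is independent and could be simulated on a separate processor. This is an illustration of the general technique, not the simulator described in the paper.

```python
from typing import List, Sequence


def count_misses(trace: Sequence[int], num_sets: int, warmup: Sequence[int] = ()) -> int:
    """Count misses of a direct-mapped cache on `trace`.

    References in `warmup` update the cache state but are not counted,
    which removes part of the cold-start error for this chunk.
    """
    cache: List[object] = [None] * num_sets
    for addr in warmup:
        cache[addr % num_sets] = addr  # warm the cache without counting
    misses = 0
    for addr in trace:
        idx = addr % num_sets
        if cache[idx] != addr:
            misses += 1
            cache[idx] = addr
    return misses


def chunked_misses(trace: Sequence[int], num_sets: int, chunk_len: int, overlap: int) -> int:
    """Sum per-chunk misses, warming each chunk with the tail of the previous one."""
    total = 0
    for start in range(0, len(trace), chunk_len):
        warm = trace[max(0, start - overlap):start]  # overlapped prefix
        total += count_misses(trace[start:start + chunk_len], num_sets, warm)
    return total


if __name__ == "__main__":
    trace = [0, 1, 2, 3] * 4
    print(count_misses(trace, 4), chunked_misses(trace, 4, 4, 0), chunked_misses(trace, 4, 4, 4))
```

On this toy trace, chunking without overlap inflates the miss count because every chunk starts with an empty cache, while a four-reference overlap brings the chunked total back to the sequential figure.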