Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162): Latest Articles

Automatic partitioning of data and computations on scalable shared memory multiprocessors
S. Tandri, T. Abdelrahman
{"title":"Automatic partitioning of data and computations on scalable shared memory multiprocessors","authors":"S. Tandri, T. Abdelrahman","doi":"10.1109/ICPP.1997.622557","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622557","url":null,"abstract":"This paper describes an algorithm for deriving data and computation partitions on scalable shared memory multiprocessors. The algorithm establishes affinity relationships between where computations are performed and where data is located based on array accesses in the program. The algorithm then uses these affinity relationships to determine both static and dynamic partitions for arrays and parallel loops. Experimental results from a prototype implementation of the algorithm demonstrate that it is computationally efficient and that it improves the parallel performance of standard benchmarks. The results also show the necessity of taking shared memory effects (memory contention, cache locality, false-sharing and synchronization) into account: partitions derived to minimize only interprocessor communications do not necessarily result in the best performance.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127735967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 33
Good processor management=fast allocation+efficient scheduling
B. S. Yoo, C. Das
{"title":"Good processor management=fast allocation+efficient scheduling","authors":"B. S. Yoo, C. Das","doi":"10.1109/ICPP.1997.622656","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622656","url":null,"abstract":"Fast and efficient processor allocation and job scheduling algorithms are essential components of a multi-user multicomputer operating system. In this paper we propose two novel processor management schemes which meet such demands for mesh-connected multicomputers. A stack-based allocation algorithm that can locate a free sub-mesh for a job very quickly using simple coordinate calculation and spatial subtraction is proposed. Simulation results show that the stack-based allocation algorithm outperforms all the existing allocation policies in terms of allocation overhead while delivering competitive performance. Another technique, called group scheduling, schedules jobs in such a way that the jobs belonging to the same group do not block each other. The groups are scheduled in an FCFS order to prevent starvation. This simple but efficient scheduling policy reduces the response time significantly by minimizing the queueing delay for the jobs in the same group. These two schemes, when used together, can provide faster service to users with very little overhead.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134195668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Software-based deadlock recovery technique for true fully adaptive routing in wormhole networks
Juan-Miguel Martínez, P. López, J. Duato, T. Pinkston
{"title":"Software-based deadlock recovery technique for true fully adaptive routing in wormhole networks","authors":"Juan-Miguel Martínez, P. López, J. Duato, T. Pinkston","doi":"10.1109/ICPP.1997.622586","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622586","url":null,"abstract":"In this paper, we take a different approach to handle deadlocks and performance degradation. We propose the use of an injection limitation mechanism that prevents performance degradation near the saturation point and reduces the probability of deadlock to negligible values even when fully adaptive routing is used. We also propose an improved deadlock detection mechanism that only uses local information, detects all the deadlocks, and considerably reduces the probability of false deadlock detection over previous proposals. In the rare case when impending deadlock is detected, our proposed recovery technique absorbs the deadlocked message at the current node and later re-injects it for continued routing towards its destination. Performance evaluation results show that our new approach to deadlock handling is more efficient than previously proposed techniques.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128619700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 66
Efficient processor allocation scheme for multi dimensional interconnection networks
Hyunseung Choo, H. Youn, G. Park, B. Shirazi
{"title":"Efficient processor allocation scheme for multi dimensional interconnection networks","authors":"Hyunseung Choo, H. Youn, G. Park, B. Shirazi","doi":"10.1109/ICPP.1997.622570","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622570","url":null,"abstract":"The task scheduling policy and the processor allocation scheme affect the system performance significantly. In this paper, we propose an efficient processor allocation scheme for a 3D mesh interconnection network with a simple FIFO scheduling policy. Complexity analysis shows that the allocation and deallocation of the scheme are O(LWH^2) and O(LH), respectively, which are better than earlier schemes. Comprehensive computer simulation shows that the average allocation time of the proposed scheme is improved up to about 85% compared to the best earlier 3D approach.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127008104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
An adaptive sequential prefetching scheme in shared-memory multiprocessors
Myoung Kwon Tcheun, H. Yoon, S. Maeng
{"title":"An adaptive sequential prefetching scheme in shared-memory multiprocessors","authors":"Myoung Kwon Tcheun, H. Yoon, S. Maeng","doi":"10.1109/ICPP.1997.622660","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622660","url":null,"abstract":"The sequential prefetching scheme is a simple hardware controlled scheme, which exploits the sequentiality of memory accesses to predict which blocks will be read in the near future. We analyze the relationship between the sequentiality of application programs and the effectiveness of sequential prefetching on shared-memory multiprocessors. Also, we propose a simple hardware scheme which selects the prefetching degree on each miss by adding a small table (PDS: Prefetching Degree Selector) to the sequential prefetching scheme. This scheme could prefetch consecutive blocks aggressively for applications with high sequentiality and conservatively for applications with low sequentiality.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127831603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
A global computing environment for networked resources
H. Topcuoglu, S. Hariri
{"title":"A global computing environment for networked resources","authors":"H. Topcuoglu, S. Hariri","doi":"10.1109/ICPP.1997.622686","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622686","url":null,"abstract":"Current advances in high-speed networks and WWW technologies have made network computing a cost-effective, high-performance computing alternative. New software tools are being developed to utilize efficiently the network computing environment. Our project, called Virtual Distributed Computing Environment (VDCE), is a high-performance computing environment that allows users to write and evaluate networked applications for different hardware and software configurations using a web interface. In this paper we present the software architecture of VDCE by emphasizing application development and specification, scheduling, and execution/runtime aspects.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125086914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Load balancing and work load minimization of overlapping parallel tasks
V. Krishnaswamy, Gagan Hasteer, P. Banerjee
{"title":"Load balancing and work load minimization of overlapping parallel tasks","authors":"V. Krishnaswamy, Gagan Hasteer, P. Banerjee","doi":"10.1109/ICPP.1997.622655","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622655","url":null,"abstract":"In this paper, we propose a unique problem in the assignment of overlapping tasks to processors on a parallel machine, with the twin objectives of minimizing workloads while maintaining good load balance. This problem arises in some applications in VLSI CAD, e.g. parallel compiled VHDL simulation. We assume that the parallel application can be decomposed into a set of tasks, each in turn comprising a finite number of subtasks. Overlapped computations arise as a result of replication of subtasks across tasks in order to reduce the amount of communication performed in fine grained parallel applications. The uniqueness of the problem stems from the fact that overlapping computation on tasks assigned to the same processor is only performed once. Theoretical results on NP-hardness and bounds on the utilization of overlap are provided. A heuristic solution is also proposed. An important application area in VLSI-CAD, parallel compiled event driven VHDL simulation is introduced. Results of the application of our heuristics to this problem are reported on a SUN Sparcserver 1000 multiprocessor.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"39 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116538230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Parallel synchronization of continuous time discrete event simulators
Peter Frey, H. Carter, P. Wilsey
{"title":"Parallel synchronization of continuous time discrete event simulators","authors":"Peter Frey, H. Carter, P. Wilsey","doi":"10.1109/ICPP.1997.622649","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622649","url":null,"abstract":"Mixed-Mode simulation has been generating considerable interest in the simulation community and has continued to grow as an active research area. Traditional mixed-mode simulation involves the merging of digital and analog simulators in various ways. However, efficient methods for the synchronization between the two time domains remain elusive. This is due to the fact that the analog simulator uses dynamic time step control whereas the digital simulator uses the event driven paradigm. This paper proposes two new synchronization methods and presents their capabilities using a component-based continuous time simulator integrated with an optimistic parallel discrete event simulator. The results of the performance evaluation lead us to believe that while both synchronization methods are functionally viable, one has superior performance.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131235691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Performance evaluation of fault tolerance for parallel applications in networked environments
Pierre Sens
{"title":"Performance evaluation of fault tolerance for parallel applications in networked environments","authors":"Pierre Sens","doi":"10.1109/ICPP.1997.622663","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622663","url":null,"abstract":"This paper presents the performance evaluation of a software fault manager for distributed applications. Dubbed STAR, it uses the natural redundancy existing in networks of workstations to offer a high level of fault tolerance. Fault management is transparent to the supported parallel applications. STAR is application independent, highly configurable and easily portable to UNIX-like operating systems. The current implementation is based on independent checkpointing and message logging. Measurements show the efficiency and the limits of this implementation. The challenge is to show that a software approach to fault tolerance can efficiently be implemented in a standard networked environment.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"1823 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129752660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Automatic generation of injective modular mappings
Hyuk-Jae Lee, J. Fortes
{"title":"Automatic generation of injective modular mappings","authors":"Hyuk-Jae Lee, J. Fortes","doi":"10.1109/ICPP.1997.622675","DOIUrl":"https://doi.org/10.1109/ICPP.1997.622675","url":null,"abstract":"Many optimizations (of programs with loops) used in parallelizing compilers and systolic array design are based on linear transformations of loop iteration spaces. Additional important optimizations and designs are possible by using recently proposed modular mappings, which are described by linear transformations modulo a constant vector. Previous work on modular mappings focused on conditions that guarantee injectivity of a modular mapping for algorithms with rectangular index sets. This paper generalizes previous work by providing new injectivity conditions that cover the cases when the program index set has arbitrary shape and size, and the target processor array and the mapping moduli are of arbitrary size. A systematic technique to efficiently generate modular mappings is also proposed. The complexity of the proposed generation technique is O(n^2 n!) for a nested loop of depth n with a rectangular index set and a target processor array with as many processors as required. A bounded search scheme is also provided for general cases. Each trial is formulated as an integer linear programming problem with at most 3n variables.","PeriodicalId":221761,"journal":{"name":"Proceedings of the 1997 International Conference on Parallel Processing (Cat. No.97TB100162)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133396901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0