Proceedings 11th International Parallel Processing Symposium: Latest Articles

A compiler-directed cache coherence scheme using data prefetching
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580970
Hock-Beng Lim, P. Yew
{"title":"A compiler-directed cache coherence scheme using data prefetching","authors":"Hock-Beng Lim, P. Yew","doi":"10.1109/IPPS.1997.580970","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580970","url":null,"abstract":"Cache coherence enforcement and memory latency reduction and hiding are very important problems in the design of large-scale shared-memory multiprocessors. The authors propose a compiler-directed cache coherence scheme which makes use of data prefetching. The cache coherence with data prefetching (CCDP) scheme uses compiler analysis techniques to identify potentially-stale data references, which are references to invalid copies of cached data. The key idea of the CCDP scheme is to enforce cache coherence by prefetching the up-to-date data corresponding to these potentially-stale references from the main memory. Application case studies were conducted to gain a quantitative idea of the performance potential of the CCDP scheme on a real system. They applied the CCDP scheme on four benchmark programs from the SPEC CFP95 and CFP92 suites, and executed them on the Cray T3D. The experimental results show that for the programs studied, the scheme provides significant performance improvements by caching shared data and reducing the remote shared-memory access penalty incurred by the programs.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129655744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Broadcasting and multicasting in cut-through routed networks
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580989
Johanne Cohen, P. Fraigniaud, J. König, A. Raspaud
{"title":"Broadcasting and multicasting in cut-through routed networks","authors":"Johanne Cohen, P. Fraigniaud, J. König, A. Raspaud","doi":"10.1109/IPPS.1997.580989","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580989","url":null,"abstract":"This paper addresses the one-to-all broadcasting problem, and the one-to-many broadcasting problem, usually simply called broadcasting and multicasting, respectively. In this paper, we study these problems under both line model, and cut-through model. The former assumes long distance calls between non neighboring processors. The latter completes the line model by taking into account the use of a routing function. It is known that one can find time optimal broadcast and multicast protocols in the line model in polynomial time. We present a new time optimal broadcasting and multicasting algorithm in the line model. This algorithm efficiently uses the bandwidth of the network. Moreover, it also applies to the cut-through model as soon as the routing function generates shortest paths only.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127547804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Optimizing parallel bitonic sort
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580914
M. Ionescu, K. Schauser
{"title":"Optimizing parallel bitonic sort","authors":"M. Ionescu, K. Schauser","doi":"10.1109/IPPS.1997.580914","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580914","url":null,"abstract":"Sorting is an important component of many applications, and parallel sorting algorithms have been studied extensively in the last three decades. One of the earliest parallel sorting algorithms is bitonic sort, which is represented by a sorting network consisting of multiple butterfly stages. The paper studies bitonic sort on modern parallel machines which are relatively coarse grained and consist of only a modest number of nodes, thus requiring the mapping of many data elements to each processor. Under such a setting optimizing the bitonic sort algorithm becomes a question of mapping the data elements to processing nodes (data layout) such that communication is minimized. The authors developed a bitonic sort algorithm which minimizes the number of communication steps and optimizes the local computation. The resulting algorithm is faster than previous implementations, as experimental results collected on a 64 node Meiko CS-2 show.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122350813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
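
The sorting network named in the abstract above can be illustrated with a minimal sequential sketch. This is the textbook bitonic network for power-of-two input lengths, not the authors' optimized data layout or their Meiko CS-2 implementation; the function name `bitonic_sort` is an assumption made for this example.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with the bitonic network.

    Illustrative sketch only: every compare-exchange runs sequentially here,
    whereas on a parallel machine each stage would run concurrently.
    """
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # compare-exchange distance within a merge
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks of size k alternate between ascending and descending.
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

Each pass of the inner `while` loop corresponds to one butterfly stage; when many elements are mapped to each processor, only the stages whose distance `j` crosses a processor boundary require communication.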
Low latency MPI for Meiko CS/2 and ATM clusters
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580929
Chris R. Jones, Ambuj K. Singh, D. Agrawal
{"title":"Low latency MPI for Meiko CS/2 and ATM clusters","authors":"Chris R. Jones, Ambuj K. Singh, D. Agrawal","doi":"10.1109/IPPS.1997.580929","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580929","url":null,"abstract":"MPI (Message Passing Interface) is a proposed message-passing standard for the development of efficient and portable parallel programs. An implementation of MPI is presented and evaluated for the Meiko CS/2, a 64-node parallel computer, and a network of 8 SGI workstations connected by an ATM switch and an Ethernet.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"275 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133268019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Semantics and implementation of a generalized forall statement for parallel languages
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580953
P. Dechering, L. Breebaart, F. Kuijlman, K. V. Reeuwijk, H. Sips
{"title":"Semantics and implementation of a generalized forall statement for parallel languages","authors":"P. Dechering, L. Breebaart, F. Kuijlman, K. V. Reeuwijk, H. Sips","doi":"10.1109/IPPS.1997.580953","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580953","url":null,"abstract":"In this paper we present a generalized forall statement for parallel languages. The forall statement occurs in many (data) parallel languages and specifies which computations can be performed independently. Many different definitions of such a construct can be found in literature, with different conditions and execution models. We will show how forall constructs of a wide class of parallel languages can be mapped to this generalized forall statement. In addition, the forall statement we propose has the ability to spawn more complex independent activities than can be found in these languages. Denotational semantics are used to define the meaning of the forall and define only one possible program state change. It is shown that it is easy to use and that it is feasible to implement this forall efficiently.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131772276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Comparing gang scheduling with dynamic space sharing on symmetric multiprocessors using automatic self-allocating threads (ASAT)
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580911
C. Severance, R. Enbody
{"title":"Comparing gang scheduling with dynamic space sharing on symmetric multiprocessors using automatic self-allocating threads (ASAT)","authors":"C. Severance, R. Enbody","doi":"10.1109/IPPS.1997.580911","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580911","url":null,"abstract":"The work considers the best way to handle a diverse mix of multi-threaded and single-threaded jobs running on a single symmetric parallel processing system. The traditional approaches to this problem are free scheduling, gang scheduling, or space sharing. The paper examines a less common technique called dynamic space sharing. One approach to dynamic space sharing, automatic self allocating threads (ASAT), is compared to all of the traditional approaches to scheduling a mixed load of jobs. Performance results for ASAT scheduling, gang scheduling, and free scheduling are presented. ASAT scheduling is shown to be the superior approach to mixing multi-threaded work with single threaded work.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"82 1-2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114121537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
A randomized sorting algorithm on the BSP model
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580912
A. Gerbessiotis, Constantinos J. Siniolakis
{"title":"A randomized sorting algorithm on the BSP model","authors":"A. Gerbessiotis, Constantinos J. Siniolakis","doi":"10.1109/IPPS.1997.580912","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580912","url":null,"abstract":"The authors present a new randomized sorting algorithm on the bulk-synchronous parallel (BSP) model. The algorithm improves upon the parallel slack of previous algorithms to achieve optimality. Tighter probabilistic bounds are also established. It uses sample sorting and utilizes recently introduced search algorithms for a class of data structures on the BSP model. Moreover the methods are within a 1+o(1) multiplicative factor of the respective sequential methods in terms of speedup for a wide range of the BSP parameters.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116636538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
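
The sample sorting mentioned in the abstract above can be sketched as a sequential simulation. This is a generic sample sort under stated assumptions, not the authors' BSP algorithm: supersteps and communication costs are not modeled, and the names `sample_sort`, `p`, `oversample`, and `seed` are hypothetical parameters chosen for this example.

```python
import bisect
import random


def sample_sort(values, p=4, oversample=8, seed=0):
    """Sort by splitting data into p buckets around randomly sampled splitters.

    Illustrative sketch only: each bucket stands in for one processor's share.
    """
    data = list(values)
    if p < 2 or len(data) <= p * oversample:
        return sorted(data)          # too small to be worth partitioning
    rng = random.Random(seed)
    # Draw p * oversample keys at random and sort the sample.
    sample = sorted(rng.sample(data, p * oversample))
    # Every oversample-th key of the sample becomes a splitter (p - 1 in total).
    splitters = [sample[i * oversample] for i in range(1, p)]
    buckets = [[] for _ in range(p)]
    for x in data:
        # bisect_right returns an index in 0..p-1: the bucket for key x.
        buckets[bisect.bisect_right(splitters, x)].append(x)
    result = []
    for bucket in buckets:           # buckets are already ordered relative to each other
        result.extend(sorted(bucket))
    return result
```

A larger `oversample` value tends to give more evenly sized buckets at the cost of sorting a bigger sample, which is the trade-off that probabilistic bounds on such algorithms quantify.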
A tool for on-line visualization and interactive steering of parallel HPC applications
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580882
S. Rathmayer
{"title":"A tool for on-line visualization and interactive steering of parallel HPC applications","authors":"S. Rathmayer","doi":"10.1109/IPPS.1997.580882","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580882","url":null,"abstract":"Tools for parallel systems today range from specification over debugging to performance analysis and more. Typically, they help the programmers of parallel algorithms from the early development stages to a certain level of program optimization. However in HPC (High Performance Computing) today the end-user of massively parallel CFD (Computational Fluid Dynamics)-programs has little or no support in his work. The scientific engineer who often runs his application on a parallel computer somewhere in the WAN (Wide Area Network) and visualizes the enormous amounts of simulation data on a graphical workstation in his LAN (Local Area Network) has needs which are by far not covered by state of the art visualization systems. The tool proposed here follows a strategy which differs completely from existing, batch-oriented and strictly sequential methods of the working process in the application cycle of parallel HPC applications. It allows both on-line visualization and interactive program steering of massively parallel CFD-applications. The parameters of the mathematical model and the numerical methods build objects of a database which can be accessed by an object-oriented graphical user interface via visualization and modification operators. Experiences with this new tool concept VIPER (VIsualization of Parallel numerical simulation algorithms for Extended Research) applied on a real-world and industrial scientific application will be shown.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123529551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 46
A hybrid interconnection network for integrated communication services
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580924
Yilong Chen, Jyh-Charn S. Liu
{"title":"A hybrid interconnection network for integrated communication services","authors":"Yilong Chen, Jyh-Charn S. Liu","doi":"10.1109/IPPS.1997.580924","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580924","url":null,"abstract":"This paper presents a hybrid interconnection network architecture to support integrated communication services for multicomputer-based database and multimedia systems. Our study shows that existing wormhole routing networks are inefficient in transfer of long files. We demonstrate the feasibility of integrating different network techniques based on virtual channels and flexible routing mechanisms.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"521 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124483496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
O(log log n) time algorithms for Hamiltonian-suffix and min-max-pair heap operations on hypercube multicomputers
Proceedings 11th International Parallel Processing Symposium Pub Date : 1997-04-01 DOI: 10.1109/IPPS.1997.580947
Sajal K. Das, M. C. Pinotti
{"title":"O(log log n) time algorithms for Hamiltonian-suffix and min-max-pair heap operations on hypercube multicomputers","authors":"Sajal K. Das, M. C. Pinotti","doi":"10.1109/IPPS.1997.580947","DOIUrl":"https://doi.org/10.1109/IPPS.1997.580947","url":null,"abstract":"We present an efficient mapping of a min-max-pair heap of size N on a hypercube multicomputer of p processors in such a way the load on each processor's local memory is balanced and no additional communication overhead is incurred for implementation of the single insertion, deletemin and deletemax operations. Our novel approach is based on an optimal mapping of the paths of a binary heap into a hypercube such that in O(log N/p+log p) time we can compute the Hamiltonian-suffix, which is defined as a pipelined suffix-minima computation on an O(log N)-length heap path embedded into the Hamiltonian path of the hypercube according to the binary reflected Gray codes. However the binary tree underlying the heap data structure is not altered by the mapping process.","PeriodicalId":145892,"journal":{"name":"Proceedings 11th International Parallel Processing Symposium","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128462592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
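
The abstract above refers to the Hamiltonian path of the hypercube given by binary reflected Gray codes. A minimal sketch of that ordering follows; it shows only the Gray-code path itself, not the authors' heap mapping, and the function names `gray_code` and `hypercube_hamiltonian_path` are assumptions made for this example.

```python
def gray_code(i):
    """Return the i-th binary reflected Gray code."""
    return i ^ (i >> 1)


def hypercube_hamiltonian_path(dim):
    """List all 2**dim hypercube node labels in Gray-code order.

    Consecutive labels differ in exactly one bit, so consecutive nodes are
    hypercube neighbors and the list visits every node exactly once.
    """
    return [gray_code(i) for i in range(1 << dim)]


def differ_in_one_bit(a, b):
    """Check that two node labels are adjacent in the hypercube."""
    x = a ^ b
    return x != 0 and (x & (x - 1)) == 0
```

For example, `hypercube_hamiltonian_path(3)` yields `[0, 1, 3, 2, 6, 7, 5, 4]`, and every adjacent pair in that list satisfies `differ_in_one_bit`.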