Proceedings. Third Working Conference on Massively Parallel Programming Models (Cat. No.97TB100228): Latest Publications

Aspects of the compilation of nested parallel imperative languages
W. Pfannenstiel, M. Dahm, M. Chakravarty, Stefan Jähnichen, G. Keller, F. Schroer, M. Simons
DOI: 10.1109/MPPM.1997.715966 | Published: 1997-11-12
Abstract: We report on our experiences with the implementation of the imperative nested parallel language V. We give an overview of the compiler and a description of its building blocks and their interplay. We show how functional and imperative constructs such as control structures and pointers are handled by transformation rules. We justify additional restrictions that had to be placed on side effects and imperative constructs and that were not initially thought necessary.
Citations: 7

Parallel programming and complexity analysis using Actors
Gul A. Agha, Wooyoung Kim
DOI: 10.1109/MPPM.1997.715963 | Published: 1997-11-12
Abstract: We describe Actors, a flexible, scalable and efficient model of computation, and develop a framework for analyzing the parallel complexity of programs written in it. Actors are asynchronous, autonomous objects which interact by message-passing. The data and process decomposition inherent in Actors simplifies modeling real-world systems. High-level concurrent programming abstractions have been developed to simplify program development using Actors; such abstractions do not compromise an efficient and portable implementation. In this paper, we define a parallel complexity model for Actors. The model we develop gives an accurate measure of performance on realistic architectures. We illustrate its use by analyzing a number of examples.
Citations: 8

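The abstract above describes actors as asynchronous, autonomous objects with private state that interact only by message-passing. A minimal sketch of that model follows; the `Actor`/`Counter` classes and the message shapes are illustrative, not taken from the paper.

```python
import threading
import queue

class Actor:
    """An actor: private state, a mailbox, and a thread that processes messages."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is None:          # poison pill shuts the actor down
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError

    def stop(self):
        self.mailbox.put(None)
        self._thread.join()

class Counter(Actor):
    """Example behaviour: counts 'inc' messages, reports on 'get'."""
    def __init__(self):
        self.count = 0                   # private state, set before the thread starts
        super().__init__()

    def receive(self, message):
        kind, reply_to = message
        if kind == "inc":
            self.count += 1
        elif kind == "get":
            reply_to.put(self.count)

counter = Counter()
for _ in range(5):
    counter.send(("inc", None))
reply = queue.Queue()
counter.send(("get", reply))
print(reply.get(timeout=5))   # 5
counter.stop()
```

Because sends are non-blocking and each actor serialises its own message handling, no locks appear in user code; this is the decomposition the abstract says simplifies modeling real-world systems.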
Variable grain architectures for MPP computation and structured parallel programming
M. Vanneschi
DOI: 10.1109/MPPM.1997.715969 | Published: 1997-11-12
Abstract: The paper discusses the relationships between hierarchically composite MPP architectures and the software technology derived from the structured parallel programming methodology, in particular the architectural support for successive modular refinements of parallel applications, and the architectural support for the parallel programming paradigms and their combinations. The structured parallel programming methodology referred to here is an application of the Skeletons model. The considered hierarchically composite architectures are MPP machine models for PetaFlops computing, composed of proper combinations of current architectural models of different granularities, where the Processors-In-Memory model is adopted at the finest granularity level. The methodologies are discussed with reference to the current PQE2000 Project on MPP general purpose systems.
Citations: 6

BaLinda: a simple parallel programming model
C. Yuen, M. Feng
DOI: 10.1109/MPPM.1997.715957 | Published: 1997-11-12
Abstract: This paper argues for the development of more general and user-friendly parallel programming models, independent of hardware structures and concurrency concepts of operating systems theory, leading to portable programs and easy-to-use languages. It then presents the BaLinda model, based on last-in/first-out threads that interact via a shared tuplespace, and argues that it is simple enough to be both general and easy to use. It also discusses the idea of using function-based objects as the basic unit of parallel execution and the hierarchical structure to partition tuplespaces.
Citations: 2

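The shared tuplespace that BaLinda builds on can be sketched in a few lines: threads coordinate only through `out` (deposit a tuple), `rd` (read a matching tuple), and `in_` (withdraw a matching tuple), with `None` fields acting as wildcards. The operation names follow the classic Linda convention; this sketch is not the BaLinda implementation itself.

```python
import threading

class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Deposit a tuple into the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _match(self, pattern, tup):
        # A None field in the pattern matches anything.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup))

    def in_(self, pattern):
        """Withdraw (remove and return) a matching tuple; block until one exists."""
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(pattern, tup):
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()

    def rd(self, pattern):
        """Read a matching tuple without removing it; block until one exists."""
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(pattern, tup):
                        return tup
                self._cond.wait()

ts = TupleSpace()
worker = threading.Thread(target=lambda: ts.out(("result", 6 * 7)))
worker.start()
print(ts.in_(("result", None)))   # ('result', 42)
worker.join()
```

Because threads never name each other, only tuples, the model stays independent of hardware structure, which is exactly the portability argument the paper makes.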
Datarol: a parallel machine architecture for fine-grain multithreading
M. Amamiya, H. Tomiyasu, S. Kusakabe
DOI: 10.1109/MPPM.1997.715971 | Published: 1997-11-12
Abstract: We discuss a design principle of massively parallel distributed-memory multiprocessor architecture which solves the latency problem, and present the Datarol machine architecture. Latencies caused by remote memory access and remote procedure calls are the most serious problems in massively parallel computers. In order to eliminate the processor idle times caused by these latencies, processors must perform fast context switching among fine-grain concurrent processes. First, we present a processor architecture, called Datarol-II, that promotes efficient fine-grain multithread execution by performing fast context switching among fine-grain concurrent processes. In the Datarol-II processor, an implicit register load/store mechanism is embedded in the execution pipeline in order to reduce the memory access overhead caused by context switching. In order to reduce local memory access latency, a two-level hierarchical memory system and a load control mechanism are also introduced. Then, we present a cost-effective design of the Datarol-II processor, which incorporates an off-the-shelf high-end microprocessor while preserving the fine-grain dataflow concept. The off-the-shelf microprocessor Pentium is used for its core processing, and a co-processor called FMP (Fine-grain Message Processor) is designed for fine-grained message handling and communication controls. The co-processor FMP is designed on the basis of the FMD (Fine-grain Message Driven) execution model, in which fine-grain multi-threaded execution is driven and controlled by simple fine-grain message communications.
Citations: 5

(De)composition rules for parallel scan and reduction
S. Gorlatch, C. Lengauer
DOI: 10.1109/MPPM.1997.715958 | Published: 1997-11-12
Abstract: We study the use of well-defined building blocks for SPMD programming of machines with distributed memory. Our general framework is based on homomorphisms, functions that capture the idea of data-parallelism and have a close correspondence with collective operations of the MPI standard, e.g., scan and reduction. We prove two composition rules: under certain conditions, a composition of a scan and a reduction can be transformed into one reduction, and a composition of two scans into one scan. As an example of decomposition, we transform a segmented reduction into a composition of partial reduction and all-gather. The performance gain and overhead of the proposed composition and decomposition rules are assessed analytically for the hypercube and compared with the estimates for some other parallel models.
Citations: 17

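One concrete instance of the scan-then-reduction composition the abstract mentions is computing reduce(+) of scan(+) in a single reduction pass. The triple encoding below is a standard construction chosen here for illustration, not lifted from the paper: each element becomes (segment sum, partial result, segment length), and the combining operator stays associative, so the fused version parallelises like any ordinary reduction.

```python
from functools import reduce
from itertools import accumulate
import operator

def two_pass(xs):
    """Reference version: prefix sums, then a total sum (two collective ops)."""
    return sum(accumulate(xs, operator.add))

def one_pass(xs):
    """Fused version: a single reduction over (segment_sum, result, length) triples."""
    def combine(a, b):
        s1, r1, c1 = a
        s2, r2, c2 = b
        # Every prefix sum inside the right segment gains the left segment's total s1.
        return (s1 + s2, r1 + r2 + s1 * c2, c1 + c2)
    _, result, _ = reduce(combine, ((x, x, 1) for x in xs))
    return result

data = [3, 1, 4, 1, 5, 9]
print(two_pass(data), one_pass(data))   # 61 61
```

On a distributed-memory machine the fused form replaces an MPI_Scan followed by an MPI_Reduce with one reduction over slightly larger elements, which is exactly the trade-off the paper assesses analytically.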
Scalable multicomputer object spaces: a foundation for high performance systems
S. Blackburn, R. Stanton
DOI: 10.1109/MPPM.1997.715974 | Published: 1997-11-12
Abstract: The development of scalable architectures at the store levels of a layered model has concentrated on processor parallelism balanced against scalable memory bandwidth, primarily through distributed memory structures of one kind or another. A great deal of attention has been paid to hiding the distribution of memory to produce a single store image across the memory structure. It is unlikely that the distribution and concurrency aspects of scalable computing can be completely hidden at that level. This paper argues for a store layer which respects the need for caching and replication, and does so at an "object" level granularity of memory use. These facets are interrelated through atomic processes, leading to an interface for the store which is strongly transactional in character. The paper describes the experimental performance of such a layer on a scalable multi-computer architecture. The behaviour of the store supports the view that a scalable cached "transactional" store architecture is a practical objective for high performance based on parallel computation across distributed memories.
Citations: 2

A parallel programming model for irregular dynamic neural networks
L. Prechelt
DOI: 10.1109/MPPM.1997.715977 | Published: 1997-11-12
Abstract: The compilation of high-level programming languages for parallel machines faces two challenges: maximizing data/process locality and balancing load. No solutions for the general case are known that solve both problems at once. The present paper describes a programming model that makes it possible to solve both problems for the special case of neural network learning algorithms, even for irregular networks with dynamically changing topology (constructive neural algorithms). The model is based on the observation that such algorithms predominantly execute local operations (on nodes and connections of the network), reductions, and broadcasts. The model is concretized in an object-centered procedural language called CuPit. The language is completely abstract: no aspects of the parallel implementation such as number of processors, data distribution, process distribution, execution model, etc. are visible in user programs. The compiler can derive most information relevant for the generation of efficient code from unannotated source code. Therefore, CuPit programs are efficiently portable. A compiler for CuPit has been built for the MasPar MP-1/MP-2 using compilation techniques that can also be applied to most other parallel machines. The paper briefly presents the main ideas of the techniques used and results obtained by the various optimizations.
Citations: 5

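The key observation in the abstract is that constructive neural algorithms mostly execute node/connection-local operations, reductions, and broadcasts, even on an irregular, growable topology. A minimal sketch of that operation pattern follows; the data structures and function names are illustrative Python, not CuPit syntax.

```python
class Node:
    def __init__(self, value=0.0):
        self.value = value
        self.in_edges = []       # (source_node, weight) pairs; may grow at runtime

def connect(src, dst, weight):
    """Topology change: constructive algorithms add edges dynamically."""
    dst.in_edges.append((src, weight))

def local_update(node):
    """Connection-local operation: weighted sum of predecessor values."""
    node.value = sum(src.value * w for src, w in node.in_edges)

def global_error(nodes, targets):
    """Reduction over all nodes (the kind the model maps to a parallel reduce)."""
    return sum((n.value - t) ** 2 for n, t in zip(nodes, targets))

def broadcast_scale(nodes, factor):
    """Broadcast: every node applies the same scalar to its incoming weights."""
    for n in nodes:
        for i, (src, w) in enumerate(n.in_edges):
            n.in_edges[i] = (src, w * factor)

a, b, out = Node(1.0), Node(2.0), Node()
connect(a, out, 0.5)
connect(b, out, 0.25)          # irregular: any node may gain edges at any time
local_update(out)
print(out.value)               # 1.0
```

Because each primitive touches only one node's neighbourhood, all of them, or a single reduced scalar, a compiler can redistribute nodes across processors freely, which is how CuPit keeps locality and load balance out of user programs.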
Performance driven programming models
W. Gropp
DOI: 10.1109/MPPM.1997.715962 | Published: 1997-11-12
Abstract: Most projections for high-performance, massively parallel processors (MPPs) include deep and complex memory hierarchies. Making efficient use of these systems will require making efficient use of these memory hierarchies, without sacrificing the advancements that have been made in algorithms. Efficient programming models that exploited the structure of the memory system were developed for vector computers, providing high performance. Where are the programming models for MPPs? Much effort has gone into automatic programming systems, such as parallelizing compilers for existing languages and new languages expressing concurrency. Unfortunately, these have rarely led to programs that can achieve near-peak performance. In this paper, we review the issues and some current approaches and suggest some new memory-oriented programming models. The development of these models is essential, because, just as with vector computing, the programming model can strongly influence the new algorithms that are needed for high-performance applications on massively parallel processors.
Citations: 1

Compiling and supporting skeletons on MPP
Susanna Pelagatti
DOI: 10.1109/MPPM.1997.715970 | Published: 1997-11-12
Abstract: Parallel programming needs a high-level programming model in which compilers and run-time supports take care of traditionally intractable problems related to efficient usage of the target machine (mapping, scheduling, data decomposition, etc.). The task of designing a real system providing such a model is highly simplified by constructing the parallel programs from scalable skeletons which capture common structural components of parallel computations. The key problem is the efficient implementation of programs composed of several nested skeleton instances. This requires optimizing the resulting process graph structure and mapping it onto the available resources in order to balance load and minimize communications. The paper describes how this can be done, despite the intractability of the problems involved, by exploiting the 'structure' imposed by the skeleton approach.
Citations: 9

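The skeleton style this abstract assumes expresses all parallel structure by composing a few higher-order templates, so the compiler sees the whole process graph of nested instances. A sketch of such nesting follows; the skeleton names (`seq`, `farm`, `pipe`, `reduce_skel`) are generic illustrations, and the sequential semantics stand in for a real parallel implementation.

```python
from functools import reduce

def seq(f):
    """Sequential worker skeleton: wraps an ordinary function."""
    return f

def farm(worker):
    """Task farm: apply `worker` independently to every input item."""
    return lambda items: [worker(x) for x in items]

def pipe(*stages):
    """Pipeline: feed the output of each stage to the next."""
    def run(x):
        for stage in stages:
            x = stage(x)
        return x
    return run

def reduce_skel(op):
    """Tree reduction over the item list with an associative `op`."""
    return lambda items: reduce(op, items)

# Nested skeleton instances: square every item (farm of seq workers),
# then sum the results (reduction), connected by a pipeline.
program = pipe(farm(seq(lambda x: x * x)),
               reduce_skel(lambda a, b: a + b))
print(program([1, 2, 3, 4]))   # 30
```

Since the only parallelism is what the templates declare, the compiler can rewrite this nesting (e.g. fuse the farm into the reduction) and map the resulting process graph onto processors, which is the optimization problem the paper addresses.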