Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing: Latest Papers

Speculative parallel graph reduction of lambda calculus to deferred substitution form
Yong-Hack Lee, Suh-Hyun Cheon
{"title":"Speculative parallel graph reduction of lambda calculus to deferred substitution form","authors":"Yong-Hack Lee, Suh-Hyun Cheon","doi":"10.1109/ICAPP.1997.651496","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651496","url":null,"abstract":"In a parallel graph reduction system, speculative evaluation can increase parallelism but waste machine resources by evaluating expressions that may eventually be discarded. When a speculative task reduces a lambda expression to WHNF (Weak Head Normal Form), substitution can lead to unbounded growth of the graph size and require copy operations. This speculative task may turn out to be unnecessary. In that case, performance is affected by the overhead of terminating all tasks propagated from the speculative task and of refreshing the memory cells allocated for copy operations. We propose a lambda form called DSF (Deferred Substitution Form), in which substitution is deferred until a mandatory task evaluates it. Since a speculative task that reduces to DSF performs no substitution, it cannot grow the graph or require copy operations. Therefore the overhead is reduced when an expression reduced to DSF is eventually unnecessary. In addition, we propose an evaluation model for DSF to increase parallelism.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124164347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Distributed parallel generation of indices for very large text databases
João Paulo W. Kitajima, M. D. Resende, B. Ribeiro-Neto, N. Ziviani
{"title":"Distributed parallel generation of indices for very large text databases","authors":"João Paulo W. Kitajima, M. D. Resende, B. Ribeiro-Neto, N. Ziviani","doi":"10.1109/ICAPP.1997.651539","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651539","url":null,"abstract":"We propose a new algorithm for the parallel generation of suffix arrays for large text databases on high-bandwidth computer networks. Suffix arrays are structures used in full text indexing which support very powerful query languages. Our algorithm is based on a parallel indirect mergesort (it is not a simple mergesort procedure) and is compared with a well known sequential algorithm (which is very efficient running on a single machine). Although network-bounded, the parallel version is theoretically and experimentally a much better alternative when compared to the sequential version (which is I/O-bounded in disk).","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"149 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129317478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
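The abstract above describes suffix arrays as a full-text index structure. A minimal sequential sketch of the data structure itself follows; it is not the parallel indirect mergesort from the paper, and the function names `build_suffix_array` and `find_occurrences` are hypothetical, chosen only for illustration. The naive construction sorts whole suffixes and suits small inputs only.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction: sorting whole suffixes is simple but slow for large texts.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern in text using binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

Because the suffixes are sorted, all occurrences of a pattern occupy one contiguous range of the array, which is why two binary searches are enough to answer a query.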
Generating communication sets efficiently on data-parallel programs
Tsung-Chuan Huang, L. Shiu, Cherng-Haw Yu
{"title":"Generating communication sets efficiently on data-parallel programs","authors":"Tsung-Chuan Huang, L. Shiu, Cherng-Haw Yu","doi":"10.1109/ICAPP.1997.651505","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651505","url":null,"abstract":"Generating local memory access sequences and communication sets efficiently is an important issue when compiling a data-parallel language into an SPMD (Single Program Multiple Data) code. Recently, several approaches have been presented; they are based on the case in which array references are distributed across an arbitrary number of processors with arbitrary block sizes using block-cyclic distribution. Typically, in order to generate explicit communication sets, each node program has to scan over the local memory access sequences. In this paper, we focus on two cases. First, array references are aligned to a common template and this template is distributed across processors using block-cyclic distribution. Second, array references are distributed across the same number of processors with the same block size. The first case is further classified into one-level and two-level mappings. We construct a block state graph to generate communication sets by scanning only a portion of the local memory access sequence. In one-level mappings and the second case, we only need to scan the active elements among the first s local active blocks, while in two-level mappings we only need to scan the active elements among the first α*s local active blocks, where s is the stride of the regular section and α is the stride of the alignment function. As a result, the efficiency can be greatly improved.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134639136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
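The abstract above relies on block-cyclic distribution and on local memory access sequences for a regular section. The sketch below illustrates only those two background notions by brute-force enumeration; it is not the block state graph method of the paper. It assumes a one-dimensional array distributed directly over the processors (no alignment function), and the names `owner_and_local` and `local_access_sequence` are hypothetical.

```python
def owner_and_local(i: int, block: int, procs: int) -> tuple[int, int]:
    """Map global index i to (owning processor, local index) under block-cyclic distribution.

    Blocks of `block` consecutive elements are dealt to processors round-robin.
    """
    owner = (i // block) % procs
    # Each full cycle of block * procs elements contributes one local block.
    local = (i // (block * procs)) * block + i % block
    return owner, local


def local_access_sequence(lower: int, upper: int, stride: int,
                          block: int, procs: int, p: int) -> list[tuple[int, int]]:
    """List (global index, local index) pairs of the regular section lower:upper:stride
    that processor p owns, by scanning every element of the section."""
    result = []
    for i in range(lower, upper + 1, stride):
        owner, local = owner_and_local(i, block, procs)
        if owner == p:
            result.append((i, local))
    return result


if __name__ == "__main__":
    # Section 0:7:3 over 2 processors with block size 2.
    for p in range(2):
        print(p, local_access_sequence(0, 7, 3, block=2, procs=2, p=p))
```

This full scan is the costly baseline; the abstract's contribution is to scan only a bounded prefix of the local active blocks instead.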
High performance computing on networks of workstations through the exploitation of function parallelism
Yung-Lin Liu, Hau-Yang Cheng, C. King
{"title":"High performance computing on networks of workstations through the exploitation of function parallelism","authors":"Yung-Lin Liu, Hau-Yang Cheng, C. King","doi":"10.1109/ICAPP.1997.651514","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651514","url":null,"abstract":"Parallel programs are often written in the SPMD (single-program-multiple-data) form for exploiting data parallelism in the applications. In this paper, we show that even in SPMD programs further parallelism can be extracted by considering the function parallelism in the programs. Exploiting function parallelism is especially important for parallel systems using the NOW (network of workstations) approach. This is because the high communication overhead in such systems can be hidden with explicit control over the function parallelism. In this paper we describe a general methodology for exploiting function parallelism in SPMD programs and discuss the considerations involved in realizing such parallelism with the multithreading facility supported by most workstations today. The resultant multithreaded parallel program is still coded in the SPMD form. We demonstrate the application of this technique to a PDE solver, which solves a system of linear equations using Jacobi relaxation. Experiments on an 8-node NOW confirm that the performance of an SPMD program can be improved further by exploiting its function parallelism.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123705138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
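The abstract above uses Jacobi relaxation for a linear system as its demonstration workload. A minimal sequential sketch of Jacobi relaxation follows; it shows only the numerical kernel, not the multithreaded SPMD scheme the paper proposes. The function name `jacobi`, the fixed iteration count, and the sample matrix are assumptions made for illustration, and convergence is assumed via a strictly diagonally dominant matrix.

```python
def jacobi(a: list[list[float]], b: list[float], iterations: int = 50) -> list[float]:
    """Approximate the solution of a x = b by Jacobi relaxation.

    Every component of the new iterate depends only on the previous iterate,
    so all components of one sweep can be computed independently.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Build the whole new vector from the old one (no in-place updates).
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    # The exact solution of this system is x = (1/11, 7/11).
    print(jacobi(a, b))
```

In a distributed setting each node would own a slice of `x` and exchange boundary values after every sweep; that exchange is the communication overhead the abstract proposes to hide with function parallelism.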
Determination of an optimal processor allocation in the design of massively parallel processor arrays
D. Fimmel, R. Merker
{"title":"Determination of an optimal processor allocation in the design of massively parallel processor arrays","authors":"D. Fimmel, R. Merker","doi":"10.1109/ICAPP.1997.651500","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651500","url":null,"abstract":"In this paper we consider the determination of allocation functions as a part of the design of massively parallel processor arrays for algorithms which can be represented as systems of uniform recurrence equations. The objective is to find allocation functions minimizing the necessary chip area for a hardware implementation of the processor array. We propose an algorithm approximately minimizing the number of processors under consideration of the necessary chip area needed to implement the processors of the processor array. The arising optimization problems can be solved using integer linear programming.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121287631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Lazy decomposition: a novel technique to control parallel task granularity
Suntae Hwang, H. Cha
{"title":"Lazy decomposition: a novel technique to control parallel task granularity","authors":"Suntae Hwang, H. Cha","doi":"10.1109/ICAPP.1997.651511","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651511","url":null,"abstract":"This paper introduces a new mechanism for the exposure of large grain parallelism. The scheme performs lazy task creation; inlining all tasks provisionally and extracting parallelism from the inlined information later on demand. However, unlike other mechanisms, the further task demand is satisfied by the next evaluation stream rather than retrospectively reversing the inlining decision of the current stream. The scheme is called lazy decomposition because decomposition itself is throttled rather than just the extraction of a task. Lazy decomposition makes the serial section clearly separated from the parallel section in an evaluation tree for a particular function, and this allows the serial section to adopt a sequential algorithm. The performance improvement is significant in divide-and-conquer applications by adoption of sequential algorithms.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115290902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A new heuristic algorithm based on GAs for multiprocessor scheduling with task duplication
T. Tsuchiya, T. Osada, T. Kikuno
{"title":"A new heuristic algorithm based on GAs for multiprocessor scheduling with task duplication","authors":"T. Tsuchiya, T. Osada, T. Kikuno","doi":"10.1109/ICAPP.1997.651499","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651499","url":null,"abstract":"In this paper, we propose a new algorithm for scheduling parallel programs represented as directed acyclic graphs onto multiprocessors with communication delays. In such systems, task duplication is known as a useful technique for shortening the length of schedules. The proposed algorithm adopts several heuristics based on GAs as well as task duplication. To apply a GA to scheduling, we design chromosomes using list representation so that each chromosome can uniquely represent a schedule of tasks. We also design genetic operators to control the degree of replication of tasks. Through simulation studies for three kinds of parallel programs under various scheduling conditions, we compare the proposed algorithm with an established algorithm proposed by Kruatrachue. As a result, it is found that the new heuristic algorithm outperforms the previous algorithm especially when communication delays are relatively small.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126869005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
A systolic architecture for sorting an arbitrary number of elements
S. Zheng, S. Olariu, M. C. Pinotti
{"title":"A systolic architecture for sorting an arbitrary number of elements","authors":"S. Zheng, S. Olariu, M. C. Pinotti","doi":"10.1109/ICAPP.1997.651484","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651484","url":null,"abstract":"We propose a simple systolic VLSI sorting architecture whose main feature is the pipelined use of a sorting network of fixed I/O size p to sort an arbitrarily large data set of N elements. Our architecture is feasible for VLSI implementation and its time performance is virtually independent of the cost and depth of the underlying sorting network. Specifically, we show that by using our design N elements can be sorted in Θ((N/p) log(N/p)) time without memory access conflicts. We also show how to use an AT²-optimal sorting network of fixed I/O size p to construct a similar systolic architecture that sorts N elements in Θ((N log N)/(p log p)) time.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134576578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A simulator construction methodology for the Shiva multiprocessor system
S. Slomka, K. Sterzl, V. Lakshmi Narasimhan
{"title":"A simulator construction methodology for the Shiva multiprocessor system","authors":"S. Slomka, K. Sterzl, V. Lakshmi Narasimhan","doi":"10.1109/ICAPP.1997.651490","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651490","url":null,"abstract":"This paper describes a simulator for the Shiva multiprocessor system and the simulator construction methodology (SCM) used in its creation. The SCM, based on the active functional unit (AFU) construct, is a modern SCM which is flexible, accurate, fast, easy to use, capable of dynamic reconfigurability at run-time, and most of all simple and capable of quick simulator construction. The AFU SCM is capable of all these things through the use of object-oriented software techniques. The Shiva simulator constructed using the AFU SCM is program-driven and capable of micro and macro architectural simulation.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133565790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Network enabled solvers for scientific computing using the NetSolve system
H. Casanova, J. Dongarra
{"title":"Network enabled solvers for scientific computing using the NetSolve system","authors":"H. Casanova, J. Dongarra","doi":"10.1109/ICAPP.1997.651477","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651477","url":null,"abstract":"Agent-based computing is increasingly regarded as an elegant and efficient way of providing access to computational resources. Several metacomputing research projects are using intelligent agents to manage a resource space and to map user computation to these resources in an optimal fashion. Such a project is NetSolve, developed at the University of Tennessee and Oak Ridge National Laboratory. NetSolve provides the user with a variety of interfaces that afford direct access to preinstalled, freely available numerical libraries. These libraries are embedded in computational servers. New numerical functionalities can be integrated easily into the servers by a specific framework. The NetSolve agent manages the coherency of the computational servers. It also uses predictions about the network and processor performances to assign user requests to the most suitable servers. This article reviews some of the basic concepts in agent-based design, discusses the NetSolve project and how its agent enhances flexibility and performance, and provides examples of other research efforts. Also discussed are future directions in agent-based computing in general and in NetSolve in particular.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133708540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2