Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing: Latest Papers

An optimal parallel algorithm for the Euclidean distance maps of binary images
A. Fujiwara, T. Masuzawa, H. Fujiwara
{"title":"An optimal parallel algorithm for the Euclidean distance maps of binary images","authors":"A. Fujiwara, T. Masuzawa, H. Fujiwara","doi":"10.1109/ICAPP.1995.472293","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472293","url":null,"abstract":"The Euclidean distance map (EDM) of a black and white n×n binary image is the n×n map where each element has the Euclidean distance between the corresponding pixel and the nearest black pixel. The EDM plays an important role in machine vision, pattern recognition and robotics. Many algorithms have been proposed for computing the EDM. In recent years, O(n²) time sequential algorithms were presented for computing the EDM. Hirata and Kato (1994) showed that their algorithm can be parallelized to run in O(n²/p) time using p processors (1≤p≤n) on the EREW PRAM. We present a parallel algorithm for computing the EDM. The algorithm runs in O(log n) time using n²/log n processors on the EREW PRAM and in O(log n/log log n) time using n² log log n/log n processors on the common CRCW PRAM, respectively. The algorithm is optimal in the sense that the product of the time and the number of processors is equal to the lower bound of the sequential time for computing the EDM.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127804549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
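The definition of the Euclidean distance map in the abstract above can be illustrated with a brute-force sequential sketch. This is not the parallel algorithm of the paper and not one of the O(n²) sequential algorithms it cites; it only restates the definition in code. The function name and the convention that `1` marks a black pixel are assumptions made for illustration.

```python
import math


def euclidean_distance_map(image):
    """Brute-force EDM of a binary image (assumed convention: 1 = black pixel).

    Each output cell holds the Euclidean distance from that pixel to the
    nearest black pixel. The cost is O(n^4) for an n x n image, so this is
    a reference for the definition only, not an efficient algorithm.
    """
    n_rows = len(image)
    n_cols = len(image[0]) if image else 0
    black = [(r, c) for r in range(n_rows) for c in range(n_cols) if image[r][c] == 1]
    if not black:
        # No black pixel exists, so every distance is undefined; use infinity.
        return [[math.inf] * n_cols for _ in range(n_rows)]
    return [
        [min(math.hypot(r - br, c - bc) for br, bc in black) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 3 x 3 input with a single black pixel in the top-left corner.
edm = euclidean_distance_map([[1, 0, 0], [0, 0, 0], [0, 0, 0]])
```

A brute-force version like this is mainly useful as a correctness oracle when testing a faster implementation on small images.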
Performance implications of virtualisation of massively parallel algorithm implementation
C. A. Farrell, D. Kieronska, M. Korda
{"title":"Performance implications of virtualisation of massively parallel algorithm implementation","authors":"C. A. Farrell, D. Kieronska, M. Korda","doi":"10.1109/ICAPP.1995.472275","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472275","url":null,"abstract":"In this paper we investigate the accuracy of performance prediction for virtualised implementations of parallel algorithms on massively parallel SIMD architectures. The main contributions of this paper are the adaption and practical evaluation of the best known algorithms for merging and sorting.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114596763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
HPSF: a horizontally-divided parallel signature file method
Jeong-Ki Kim, Jae-Woo Chang
{"title":"HPSF: a horizontally-divided parallel signature file method","authors":"Jeong-Ki Kim, Jae-Woo Chang","doi":"10.1109/ICAPP.1995.472242","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472242","url":null,"abstract":"In order to achieve good performance, the signature file approach has been required to support parallel database processing. Therefore, in this paper we propose a horizontally-divided parallel signature file method (HPSF) using extendible hashing and frame-slicing techniques. In addition, we propose a heuristic processor allocation method so that we may assign signatures into a given number of processors in a uniform way. To show the efficiency of HPSF, we evaluate the performance of HPSF in terms of retrieval time, storage overhead, and insertion time. Finally, we show from the performance results that HPSF outperforms the conventional parallel signature file methods on retrieval performance as well as insertion time.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115092047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Architectural characteristics and hardware cost of a class of interconnection networks
M. Hamdi
{"title":"Architectural characteristics and hardware cost of a class of interconnection networks","authors":"M. Hamdi","doi":"10.1109/ICAPP.1995.472177","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472177","url":null,"abstract":"A new class of interconnection networks is proposed for interconnecting the processors of a general purpose parallel computer which is based on the hierarchical application of a complete graph compound. The systematic construction of this new class of interconnection networks, RCC, is shown and its properties are derived and are compared favorably to other interconnection networks. A specific instance of this class, RCC-CUBE, is shown to have desirable network properties such as small diameter, small degree, high density, and high bandwidth. The hardware cost and physical time performance are estimated for RCC-CUBE and compared to those of the hypercube and the 2-D mesh demonstrating an overall cost-effectiveness for RCC-CUBE. Thus, the RCC-CUBE appears to be a good candidate for next generation massively parallel computer systems.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116924657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Scheduling of precedence constrained tasks on multiprocessor systems
Chih-Ming Yen, S. Tseng, Chao-Tung Yang
{"title":"Scheduling of precedence constrained tasks on multiprocessor systems","authors":"Chih-Ming Yen, S. Tseng, Chao-Tung Yang","doi":"10.1109/ICAPP.1995.472208","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472208","url":null,"abstract":"The problem of scheduling a set of precedence-constrained tasks onto a finite number of identical processors with and without communication overhead is studied. The objective is to minimize the makespan. In this paper, we are concerned with a priority-list scheduling method. A new policy for ranking the priority of each task is proposed. Under this priority policy, two heuristic algorithms are proposed to solve task scheduling problems with and without communication overheads. Experiments show that our algorithm for solving the problem without communication overhead improves on the previous result by about 20%; for problems with communication overhead the improvement is about 70% over previous work.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116396379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
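Priority-list scheduling, as named in the abstract above, can be sketched for the case without communication overhead. The abstract does not describe the paper's new ranking policy, so this sketch substitutes a common textbook priority, the bottom level (longest path from a task to an exit task). The function name, the task names, and the durations are hypothetical, and the precedence graph is assumed to be acyclic.

```python
def list_schedule(durations, preds, num_procs):
    """Greedy priority-list scheduling on identical processors.

    durations: {task: run time}; preds: {task: [tasks that must finish first]}.
    Priority is the bottom level (an assumed policy, not the paper's).
    Returns {task: (processor, start, finish)}. Assumes an acyclic graph.
    """
    succs = {t: [] for t in durations}
    for task, parents in preds.items():
        for p in parents:
            succs[p].append(task)

    level = {}

    def bottom_level(t):
        # Longest path length from t to any exit task, including t itself.
        if t not in level:
            level[t] = durations[t] + max((bottom_level(s) for s in succs[t]), default=0)
        return level[t]

    for t in durations:
        bottom_level(t)

    proc_free = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}
    pending = set(durations)
    while pending:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in pending if all(p in schedule for p in preds.get(t, ()))]
        task = min(ready, key=lambda t: (-level[t], t))  # highest priority first
        proc = min(range(num_procs), key=lambda i: proc_free[i])
        data_ready = max((schedule[p][2] for p in preds.get(task, ())), default=0)
        start = max(proc_free[proc], data_ready)
        finish = start + durations[task]
        schedule[task] = (proc, start, finish)
        proc_free[proc] = finish
        pending.remove(task)
    return schedule


# Hypothetical four-task graph: c depends on a; d depends on a and b.
plan = list_schedule({"a": 2, "b": 3, "c": 1, "d": 2}, {"c": ["a"], "d": ["a", "b"]}, 2)
makespan = max(finish for _, _, finish in plan.values())
```

Swapping in a different ranking policy only requires changing how `level` is computed, which is the part of the method the paper reports improving.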
An improvement to dynamic stream handling in dataflow computers
V. Lakshmi, C. Arnold
{"title":"An improvement to dynamic stream handling in dataflow computers","authors":"V. Lakshmi, C. Arnold","doi":"10.1109/ICAPP.1995.472306","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472306","url":null,"abstract":"This paper presents a new method of implementing dynamic streams of streams using token relabelling which reduces the complexity and drawbacks of the previously proposed method due to Gaudiot. Consider a sequence of tokens, Vi_[ui], which will appear in sequence on the stream-carrying arc. Two tokens, Va_[ux] and Vb_[uy], will be considered as belonging to the same stream if they have the same context: [ux]=[uy]. Elements within a stream are ordered according to the sequence in time that they appear on the arc. Let the highest level of streams have the context [uO], that of the surrounding block. Thus the highest level stream is the sequence of values Vi_[uO]. Each element of this stream has as its value a unique context, namely, that of the stream that it represents. So the token Vi_[uO] identifies as a stream the sequence of tokens whose context is [Vi].","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129003729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reduced reachability graphs with parallel actions and dynamic replacement
H. Mountassir
{"title":"Reduced reachability graphs with parallel actions and dynamic replacement","authors":"H. Mountassir","doi":"10.1109/ICAPP.1995.472231","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472231","url":null,"abstract":"Analysis of communication protocols by the conventional state exploration is a well known technique. It is actually implemented in several tools of validation. The major problem of this technique is its restricted applicability, which depends on the available memory. The number of reachable states is often large and sometimes infinite. In this paper we discuss a reduction technique to build graphs that are as small as possible while preserving the same properties. To this end, vectors of executable actions are proposed to eliminate redundancy of sequences and intermediate states. The depth-first and the breadth-first algorithms based on the concept of dynamic replacement are used in order to reduce the final graphs. Two major questions are discussed: the finiteness of the graphs and the verification of the communication properties.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130887462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Algorithmic aspects and computing trends in computational electromagnetics using massively parallel architectures
C. Rowell, V. Shankar, W. Hall, A. Mohammadian
{"title":"Algorithmic aspects and computing trends in computational electromagnetics using massively parallel architectures","authors":"C. Rowell, V. Shankar, W. Hall, A. Mohammadian","doi":"10.1109/ICAPP.1995.472266","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472266","url":null,"abstract":"Accurate and rapid evaluation of radar signature for alternative aircraft/store configurations would be of substantial benefit in the evolution of integrated designs that meet radar cross section requirements across the threat spectrum. Finite-volume time domain methods offer the possibility of modeling the whole aircraft, including penetrable regions and stores, at longer wavelengths on today's supercomputers and at typical airborne radar wavelengths on the massively parallel teraflop computers of tomorrow. To realize this potential, practical means are being developed for the rapid generation of grids on and around the aircraft, and numerical algorithms that maintain high order accuracy on such grids are being constructed. A structured grid and an unstructured grid based finite-volume, time-domain Maxwell's equation solver has been developed incorporating modeling techniques for general radar absorbing materials. 
Using this work as a base, the goal of the computational electromagnetics effort is to define, implement, and evaluate rapid prototype signature prediction, addressing many issues related to (1) physics of electromagnetics, (2) efficient and higher-order accurate algorithms, (3) boundary condition procedures, (4) geometry and gridding (structured and unstructured), (5) computer architecture, and (6) validation.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125349277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Optimization in a hierarchical distributed performance monitoring system
Ling Shi, O. de Vel, Jiannong Cao, M. Cosnard
{"title":"Optimization in a hierarchical distributed performance monitoring system","authors":"Ling Shi, O. de Vel, Jiannong Cao, M. Cosnard","doi":"10.1109/ICAPP.1995.472238","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472238","url":null,"abstract":"Monitoring program execution in a distributed system can generate large quantities of data, and the collection and processing of the monitoring data is one of the primary factors that contribute to the complexity of distributed monitoring. In order to reduce such complexity, a hierarchical distributed performance monitoring system has been developed. In this paper we describe an optimization method to improve the efficiency of the monitoring system. By considering the topology used by the application program and the distribution of monitoring records, an optimized grouping can be determined to obtain an improved performance for the monitoring system. The experiments presented in this paper have demonstrated such an improvement in performance.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115215497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Two dynamic performance tuning methods for portable parallel programs
K. Suzaki, T. Kurita, H. Tanuma, S. Hirano, Y. Ichisugi
{"title":"Two dynamic performance tuning methods for portable parallel programs","authors":"K. Suzaki, T. Kurita, H. Tanuma, S. Hirano, Y. Ichisugi","doi":"10.1109/ICAPP.1995.472244","DOIUrl":"https://doi.org/10.1109/ICAPP.1995.472244","url":null,"abstract":"We present two dynamic performance tuning methods for portable parallel programs on various parallel computers. In parallel programs the affinity between parallel algorithms and the architecture of the target parallel computer is very important. In this paper we focus on the parallelism in view of the number of micro-tasks which are processing units in parallel programs. The presented methods estimate the optimal number of micro-tasks before the parallel processing is invoked. Furthermore, they shorten the execution time of the parallel program so that it is close to the optimal execution time. The estimation is based on the result of pre-executions of the program for different sizes of the data to be processed on a target parallel computer. One tuning method uses nearest-neighbor interpolation and the other uses spline interpolation for the estimation. We tested these tuning methods using a parallel square-matrix multiplication program written in Dataparallel C on three different parallel computers; a Paragon, an iPSC/2, and an nCUBE/2. In these experiments, the method using nearest-neighbor interpolation brought the execution time closer to the optimum than did the method using spline interpolation. 
The nearest-neighbor interpolation method yielded average execution times, which are given in terms of the optimal execution time, of 1.01 for the Paragon, 1.005 for the iPSC/2, and 1.052 for the nCUBE/2.","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"33 3-4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120999263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
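The nearest-neighbor tuning method described in the abstract above can be sketched as a lookup over pre-execution results. The abstract does not specify the data layout or the tie-breaking rule, so the function name, the `(data size, best micro-task count)` sample format, the sample values, and the choice of the smaller size on ties are all assumptions made for illustration.

```python
def estimate_micro_tasks(samples, data_size):
    """Estimate a micro-task count by nearest-neighbor interpolation.

    samples: list of (data_size, best_micro_task_count) pairs, assumed to come
    from pre-executions of the program on the target machine.
    Returns the count recorded for the sampled size closest to data_size;
    ties go to the smaller sampled size (an arbitrary, assumed rule).
    """
    if not samples:
        raise ValueError("at least one pre-execution sample is required")
    _, best_count = min(samples, key=lambda s: (abs(s[0] - data_size), s[0]))
    return best_count


# Hypothetical pre-execution results: matrix size -> best micro-task count.
measured = [(64, 4), (128, 8), (256, 16)]
tasks_for_100 = estimate_micro_tasks(measured, 100)
```

Nearest-neighbor lookup never returns a count that was not actually measured, which may be one reason it behaved well in the reported experiments; the abstract itself does not state the cause.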