{"title":"A fast parallel sorting algorithm on the k-dimensional reconfigurable mesh","authors":"Ju-wook Jang, Kichul Kim","doi":"10.1109/ICAPP.1997.651519","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651519","url":null,"abstract":"We present a new parallel sorting algorithm on the k-dimensional reconfigurable mesh, which is a generalized version of the well-studied (two-dimensional) reconfigurable mesh. We introduce a new mapping technique which combines the enlarged bandwidth of the multidimensional mesh and the feature of the reconfigurable mesh. Using our mapping technique, we show that N^k numbers can be sorted in O(4^k) time (constant time for small k) on a (k+1)-dimensional reconfigurable mesh of size N×N×...×N (k+1 times). In addition, it is shown that the number of 1's in a 0/1 array of size N×N×...×N (k times) can be computed in O(log* N + log k) time on a reconfigurable mesh of size N×N×...×N (k times).","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126699787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelization of the H.261 video coding algorithm on the IBM SP2(R) multiprocessor system","authors":"N. Yung, K. Leung","doi":"10.1109/ICAPP.1997.651523","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651523","url":null,"abstract":"In this paper, the parallelization of the H.261 video coding algorithm on the IBM SP2 multiprocessor system is described. Based on domain decomposition as a framework, data partitioning, data dependencies and communication issues are carefully assessed. From these, two parallel algorithms were developed. The first one maximizes processor utilization and the second one minimizes communications. Our analysis shows that the first algorithm exhibits poor scalability and high communication overhead; and the second algorithm exhibits good scalability and low communication overhead. A best median speed up of 13.72 or 11 frames/sec was achieved on 24 processors.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123367540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelization of IP-packet filter rules","authors":"Takeshi Miei, M. Maruyama, T. Ogura, N. Takahashi","doi":"10.1109/ICAPP.1997.651506","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651506","url":null,"abstract":"A compiler for parallelizing IP-packet filter rules is presented which will improve network security and reduce packet-forwarding performance degradation. It analyzes the interdependence of packet-filtering rules specified by a network administrator and translates them into an intermediate program whose instructions can be executed in parallel. Three types of compiler operations are introduced: division is used to divide the rules into parallel expressions, simplification is used to simplify redundant rules, deletion is used to delete infeasible rules.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"400 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122856083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fibre channel-based architecture for Internet multimedia server clusters","authors":"Shenze Chen, M. Thapar","doi":"10.1109/ICAPP.1997.651512","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651512","url":null,"abstract":"In this paper, we present a cluster architecture for Internet multimedia servers, which uses the Fibre Channel (FC) technology to overcome some of the shortcomings of existing architectures. We also explore the design issues of an FC-based multimedia server cluster. A significant advantage of the FC-based cluster is that it allows physical storage attachment to the interconnect. Because of this feature, FC-based clusters will change the fundamental data-sharing paradigm of existing clusters by eliminating remote data accesses in a cluster. Many aspects of this architecture are critical to real-time multimedia applications, such as audio and video services.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114803689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An enhanced 2D buddy strategy for submesh allocation in mesh networks","authors":"T. Juang, Y. Tseng, Yuh-Shyan Chen","doi":"10.1109/ICAPP.1997.651503","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651503","url":null,"abstract":"The efficient allocation problem plays an important role in partitionable multiprocessor systems. It is critical to the performance of parallel computers, especially large-scale parallel computers. In this paper, we propose a new enhanced two-dimensional buddy system (E2DBS) strategy which overcomes the drawbacks of the previous two-dimensional buddy system (2DBS) strategy; for example, four non-buddy submeshes can be allocated, and the requesting tasks and the system need not be square. In E2DBS, we propose an adaptive data structure, called the free sub-mesh matrix (FSM), to maintain the free submeshes, which makes it easy to allocate and deallocate processors. Simulation results indicate that our strategy outperforms the previous ones, i.e. the 2DBS strategy and the best fit strategy, in terms of system processor utilization and average waiting time under various system loads for rectangular requesting tasks whose side lengths are powers of 2.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124860290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An FPGA-based on-line neural system in photon counting intensified imagers for space applications","authors":"M. Alderighi, S. D'Angelo, G. Sechi, F. d'Ovidio","doi":"10.1109/ICAPP.1997.651530","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651530","url":null,"abstract":"A computational system based on a synchronous feedback neural network for the online event processing of a photon counting intensified CCD detector is presented. The hardware prototype, implemented by means of FPGA technology, consists of 5×5 and is able to identify photon events against spurious and/or noise events. It shows a high level of flexibility, which is essential in the characterization phase of the detector. It allows the implementation of different kinds of neurons, having different output functions and internal architectures, and the running of actual, as well as virtual, networks of neurons.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"163 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113949330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HiPAR-DSP: a parallel VLIW RISC processor for real time image processing applications","authors":"J. Wittenburg, M. Ohmacht, J. Kneip, W. Hinrichs, P. Pirsch","doi":"10.1109/ICAPP.1997.651487","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651487","url":null,"abstract":"Derived from a thorough analysis of the properties of a wide class of image processing algorithms, a parallel RISC architecture has been developed. The architecture gains performance from data level parallelism as well as from instruction level parallelism. From the beginning of the concept phase, high-level programming capabilities have been one of the major design goals. Thus, there has been a steady interaction between the design of the software development toolkit (optimizing assembler and C++ compiler) and the architecture itself. The RISC-typical register files are one of the most critical elements with respect to die size and clock frequency as well as the assembler's ability in VLIW scheduling. Running at 100 MHz (200 mm², 0.35 μm CMOS), the processor reaches a sustained performance of more than 2 GOPS for a wide range of image processing algorithms.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122737470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtual parallel processors","authors":"C. Dick, F. Harris","doi":"10.1109/ICAPP.1997.651485","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651485","url":null,"abstract":"The introduction of SRAM-based field programmable gate arrays (FPGAs) has opened up a new dimension to parallel computing architectures. This paper describes an alternative approach to parallel computing: reconfigurable, or virtual, parallel processing (VPP). Rather than mapping an application onto a given parallel machine, the VPP approach synthesizes the appropriate type and number of processing elements, as well as the interconnection topology, that is optimal for the application. For each application, configuration data is downloaded to the machine that personalizes the hardware for the task at hand. The paper provides a brief description of the authors' reconfigurable computer, Archimedes. The benefits of the VPP approach are highlighted by an example application, the 2-D FFT. A novel parallel implementation of a polynomial transform based 2-D transform is described and compared to results for distributed memory parallel machines that have been reported in the literature. The comparison highlights the computational advantage provided by reconfigurable computing.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129402902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shadow Stacks-a hardware-supported DSM for objects of any granularity","authors":"S. Groh, M. Pizka, J. Rudolph","doi":"10.1109/ICAPP.1997.651493","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651493","url":null,"abstract":"This paper presents a new Distributed Shared Memory (DSM) management concept that is integrated into a scalable distributed virtual memory management technique and circumvents false sharing while still preserving simplicity at the application level. Objects defined as usual by variables in the declaration part of functions are made sharable among threads executing in the distributed environment. These objects of varying granularity and with different consistency requirements are managed separately to avoid false sharing. Consistency is enforced at runtime by a distributed manager-agent architecture that supports automatic and dynamic selection of an adequate coherence protocol per object. To provide efficiency, the implementation of the Shadow Stacks concept is based on the exploitation of the page fault mechanism provided by off-the-shelf hardware.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128578821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Artificial neural architecture for real time modelling applications","authors":"E. Petriu, A. Guergachi, G. Patry, L. Zhao, D. Petriu, G. Vukovich","doi":"10.1109/ICAPP.1997.651529","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651529","url":null,"abstract":"This paper presents the random-pulse machine concept and shows how it can be used for the modular design of artificial neural networks. Random-pulse machines deal with analog variables represented by the mean rate of random-pulse streams and use simple digital technology to perform arithmetic and logic operations. As an application example, a neural network is proposed for modeling activated sludge wastewater treatment plants.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116420418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}