Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing最新文献_第3页

The design and analysis of parallel algorithms 并行算法的设计与分析

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994219

C. Rodríguez

{"title":"The design and analysis of parallel algorithms","authors":"C. Rodríguez","doi":"10.1109/EMPDP.2002.994219","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994219","url":null,"abstract":"omputing Models provide frames for the analysis and design of algorithms. Unfortunately, the balance required between simplicity and realism makes it difficult to guarantee the necessary accuracy for the whole range of algorithms and machines. Simplicity implies a minimal number of architecture parameters (usually including computational power, bandwidth and latency). Accuracy implies just the opposite. The short history of Parallel Computing has seen the arrival (and the departure) of many proposals. Undoubtedly, the best known among those is the Parallel Random Access Machine (PRAM), the Postal/LogP Model and the Bulk Synchronous Parallel Model (BSP). From these three, the oldest one, the PRAM model, has been discarded as unrealistic. The other two, LogP and BSP, remain but do not escape of those aforementioned conflicts. Each model enforces/matches a different parallel programming style. To make the situation worse, none of these two styles agrees completely with the currently dominant style in parallel and distributed programming: MPI message passing. The talk will make emphasis on BSP, its weakness and strengths. As developing examples, we will use two programming paradigms: nested data parallelism and pipelining. C","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131772524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Exploiting the multilevel parallelism and the problem structure in the numerical solution of stiff ODEs 利用多层并行性和问题结构在刚性ode数值解中的应用

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994262

J. M. Mantas, J. Ortega, J. Carrillo

{"title":"Exploiting the multilevel parallelism and the problem structure in the numerical solution of stiff ODEs","authors":"J. M. Mantas, J. Ortega, J. Carrillo","doi":"10.1109/EMPDP.2002.994262","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994262","url":null,"abstract":"A component-based methodology to derive parallel stiff ordinary differential equation (ODE) solvers for multicomputers is presented. The methodology allows the exploitation of the multilevel parallelism of this kind of numerical algorithm and the particular structure of ODE systems by using parallel linear algebra modules. The approach promotes the reusability of design specifications and clear structuring of the derivation process. Two types of components are defined to enable the separate treatment of different aspects during the derivation of a parallel stiff ODE solver. The approach has been applied to the implementation of an advanced numerical stiff ODE solver on a PC cluster. Following the approach, the parallel numerical scheme has been optimized and adapted to the solution of two modelling problems which involve stiff ODE systems with dense and narrow banded structures respectively. Numerical experiments have been performed to compare the solver with the state-of-the-art sequential stiff ODE solver. The results show that the parallel solver performs especially well with dense ODE systems and reasonably well with narrow banded systems.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124052278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

The effect of local sort on parallel sorting algorithms 局部排序对并行排序算法的影响

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994310

Daniel Jiménez-González, J. Navarro, J. Larriba-Pey

引用次数: 20

Reducing the latency of L2 misses in shared-memory multiprocessors through on-chip directory integration 通过片上目录集成减少共享内存多处理器中L2缺失的延迟

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994312

M. Acacio, José González, José M. García, J. Duato

{"title":"Reducing the latency of L2 misses in shared-memory multiprocessors through on-chip directory integration","authors":"M. Acacio, José González, José M. García, J. Duato","doi":"10.1109/EMPDP.2002.994312","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994312","url":null,"abstract":"Recent technology improvements allow multiprocessor designers to put some key components inside the processor chip, such as the memory controller and the network interface. In this paper, we exploit such an integration scale, presenting a new three-level directory architecture aimed at reducing the long L2 miss latencies and the memory overhead that characterize cc-NUMA machines and limit their scalability. The proposed architecture is based on the integration into the processor chip of the directory controller and a small first-level directory cache that stores precise information for the most recently referenced memory lines, as the means to reduce miss latencies. The second- and third-level directories are located near the main memory and they are only accessed when a directory entry for a certain memory line is not present in the first-level directory. This off-chip structure achieves the performance of a large and non-scalable full-map directory with a very significant reduction in the memory overhead. Using execution-driven simulations, we show that substantial latency reductions can be obtained by using the proposed directory architecture. Load, store and read-modify-write misses are significantly accelerated (latency reductions of more than 35% in some cases). These reductions translate into important improvements on the final application performance (reductions up to 20% in execution time).","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115772045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

SICOSYS: an integrated framework for studying interconnection network performance in multiprocessor systems SICOSYS:一个研究多处理器系统互连网络性能的集成框架

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994207

Valentin Puente, J. Gregorio, R. Beivide

引用次数: 107

Efficient implementation of reduce-scatter in MPI MPI中reduce-scatter的高效实现

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994296

M. Bernaschi, G. Iannello, Mario Lauria

引用次数: 16

Mobile agents in a distributed heterogeneous database system 分布式异构数据库系统中的移动代理

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994247

Balázs Goldschmidt, Z. László, M. Döller, H. Kosch

{"title":"Mobile agents in a distributed heterogeneous database system","authors":"Balázs Goldschmidt, Z. László, M. Döller, H. Kosch","doi":"10.1109/EMPDP.2002.994247","DOIUrl":"https://doi.org/10.1109/EMPDP.2002.994247","url":null,"abstract":"The purpose of this paper is to present a new infrastructure for multimedia database searches based on CORBA and mobile agent technology. A new mobile agent system, called Vagabond, was implemented in pure Java using only standard CORBA facilities. The fundamental agent design and architecture is introduced. Measurements demonstrated the merits of Vagabond, namely the simple design, the implicit heterogeneity inherited from CORBA, and its speed. The system (renamed as M/sup 3/) was implanted inside an Oracle8i database system which is able to run Java code as a stored procedure. Further measurements have justified the idea presented above, ie. sending agents directly inside the database can decrease the response time of multimedia content search and retrieval. However, the required modifications made the embedded agency accessible for clients using only Aurora, Oracle's modified Visi-broker ORB. On the basis of the Proxy design pattern, the paper presents a proxy solution that encapsulates the specific protocol issues that restricted interoperability, and thus provides the user of the infrastructure with the benefits of a truly heterogeneous environment.","PeriodicalId":126071,"journal":{"name":"Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132636313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 12

Integrating MPI and nanothreads programming model 集成MPI和纳米线程编程模型

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994297

P. Hadjidoukas, E. D. Polychronopoulos, T. Papatheodorou

引用次数: 6

Programming distributed systems with Group_IO 使用Group_IO编程分布式系统

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994266

F. Santana, Javier Miranda, J. M. Santos, Ernestina Martel, Luis Hernández, E. Pulido

引用次数: 3

An observation mechanism of distributed objects in Java Java中分布式对象的观察机制

Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing Pub Date : 2002-01-09 DOI: 10.1109/EMPDP.2002.994246

Amer Bouchi, R. Olejnik, B. Toursel

引用次数: 12