2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing: Latest Publications

Balancing Workloads of Servers Maintaining Scalable Distributed Data Structures
Grzegorz Lukawski, K. Sapiecha
{"title":"Balancing Workloads of Servers Maintaining Scalable Distributed Data Structures","authors":"Grzegorz Lukawski, K. Sapiecha","doi":"10.1109/PDP.2011.13","DOIUrl":"https://doi.org/10.1109/PDP.2011.13","url":null,"abstract":"A new architecture of Scalable Distributed Data Structures (SDDS) is presented and evaluated. It applies to SDDS files with overactive servers. Every bucket of the file is supplemented with a reference counter that counts the references to the bucket. The count reflects the activity of the bucket and is used to select the most active and most often used buckets (overactive servers). Workloads of the servers are then balanced with the help of so-called scalability of throughput. It is proven that this gives very good results for read-mostly databases, where extensive pattern matching takes place.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116160479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
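The per-bucket reference counting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `BucketStats`, the method names, and the "more than `factor` times the mean" threshold are all assumptions chosen for illustration, since the abstract does not say how overactive buckets are selected.

```python
from collections import Counter


class BucketStats:
    """Hypothetical bookkeeping: one reference counter per bucket."""

    def __init__(self):
        self.refs = Counter()  # bucket id -> number of references seen

    def record_access(self, bucket_id):
        # Count up on every reference to the bucket.
        self.refs[bucket_id] += 1

    def overactive(self, factor=2.0):
        # Assumed rule: a bucket is overactive when its count exceeds
        # `factor` times the mean count over all referenced buckets.
        if not self.refs:
            return []
        mean = sum(self.refs.values()) / len(self.refs)
        return sorted(b for b, n in self.refs.items() if n > factor * mean)
```

A balancing layer could poll `overactive()` periodically and redirect or replicate the buckets it returns; how that is done in the paper is not stated in the abstract.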
Converging Quickly to Independent Uniform Random Topologies
Anne-Marie Kermarrec, V. Leroy, Christopher Thraves
{"title":"Converging Quickly to Independent Uniform Random Topologies","authors":"Anne-Marie Kermarrec, V. Leroy, Christopher Thraves","doi":"10.1109/PDP.2011.60","DOIUrl":"https://doi.org/10.1109/PDP.2011.60","url":null,"abstract":"The peer sampling service is a core building block for gossip protocols in peer-to-peer networks. Ideally, a peer sampling service continuously provides each peer with a sample of peers picked uniformly at random in the network. While empirical studies have shown that uniformity was achieved, the analyses proposed so far assume strong restrictions on the topology of the overlay network it continuously generates. In this work, we analyze a Generic Random Peer Sampling Service (GRPS) that satisfies the desirable properties for any peer sampling service (small views, uniform sample, load balancing, and independence) and relieve strong degree connections in the nodes assumed in previous works. The main result we prove is: starting from any simple (without loops and parallel edges) directed graph with out-degree equal to c for all nodes, and recursively applying GRPS, eventually results in a random simple directed graph with out-degree equal to c for all nodes. We empirically test convergence time and independence time for GRPS. Finally, we use this empirical evaluation to show that GRPS performs better than previously presented peer sampling services.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124581337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Dense Dynamic Programming on Multi GPU
V. Boyer, D. E. Baz, M. Elkihel
{"title":"Dense Dynamic Programming on Multi GPU","authors":"V. Boyer, D. E. Baz, M. Elkihel","doi":"10.1109/PDP.2011.25","DOIUrl":"https://doi.org/10.1109/PDP.2011.25","url":null,"abstract":"The implementation via CUDA of a hybrid dense dynamic programming method for knapsack problems on a multi-GPU architecture is considered. Tests are carried out on a Bull cluster with Tesla S1070 computing systems. A first series of computational results shows substantial speedup. The speedup factor is close to 28 with two GPUs.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125165216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
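For readers unfamiliar with dense dynamic programming for the knapsack problem, the textbook sequential recurrence is sketched below. This is not the paper's hybrid CUDA method; the function name `knapsack_dense` and the two-array layout are assumptions made for illustration. The two-array form is used because every entry of a stage depends only on the previous stage, which is the property that makes a stage a natural data-parallel step on a GPU.

```python
def knapsack_dense(weights, profits, capacity):
    """Best total profit for a 0-1 knapsack with integer weights.

    Dense DP: one table entry per capacity value 0..capacity, one
    stage per item.
    """
    best = [0] * (capacity + 1)
    for w, p in zip(weights, profits):
        prev = best
        # Each entry reads only from `prev`, so all entries of a stage
        # are independent of each other.
        best = [
            max(prev[c], prev[c - w] + p) if c >= w else prev[c]
            for c in range(capacity + 1)
        ]
    return best[capacity]
```

For example, `knapsack_dense([1, 2, 3], [6, 10, 12], 5)` selects the items of weight 2 and 3 for a total profit of 22.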
PETransWS: Web Service Computing Platform for Logistics and Transportation
F. Almeida, Vicente Blanco Pérez, J. Brito, Andres Crespo, J. Moreno-Pérez, Adrián Santos
{"title":"PETransWS: Web Service Computing Platform for Logistics and Transportation","authors":"F. Almeida, Vicente Blanco Pérez, J. Brito, Andres Crespo, J. Moreno-Pérez, Adrián Santos","doi":"10.1109/PDP.2011.86","DOIUrl":"https://doi.org/10.1109/PDP.2011.86","url":null,"abstract":"In large organizations and small firms in transportation, there is a growing need to use and analyze spatial data. Transportation system analysis and planning as well as mobility studies frequently use Geographic Information Systems (GIS). In this paper we propose the development of a web services platform dedicated to transportation and logistics. Taking advantage of the web services development framework PyOpenCF we integrate in the same environment services oriented to geolocalization, logistical optimization, etc. We develop a PyOpenCF client whose graphic interface allows processes with spatial data to be launched and provides a visualization of the results.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130305879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dynamic Load Balancing for High-Performance Simulations of Combustion in Engine Applications
L. Antonelli, P. D'Ambra
{"title":"Dynamic Load Balancing for High-Performance Simulations of Combustion in Engine Applications","authors":"L. Antonelli, P. D'Ambra","doi":"10.1109/PDP.2011.88","DOIUrl":"https://doi.org/10.1109/PDP.2011.88","url":null,"abstract":"The chemical task in internal combustion engine simulations concerns the solution of a non-linear stiff system of Ordinary Differential Equations (ODEs) for each cell of a discretization grid representing engine geometry. The computational cost of this task, when a detailed kinetic scheme is used, dominates engine simulations. Due to local physical-chemical conditions, each system of ODEs is characterized by local numerical properties (such as stiffness); therefore, local adaptive solvers are usually employed for its efficient solution. We developed an MPI-based parallel combustion solver for efficient solution of the chemical task in engine simulations within a parallel environment. In this context, we propose a cell distribution based on a dynamic load balancing algorithm, using a strategy which preserves contiguousness of the computational grid cells. Efficiency of our approach is shown for parallel simulations of realistic Diesel engines, when different sizes of the discretization grid and different operative conditions of the engine are used.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114736919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
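One simple way to distribute cells so that each process keeps a contiguous range is to cut a prefix sum of per-cell costs at equal fractions of the total. The sketch below shows that idea only; it is an assumed stand-in, not the dynamic load balancing algorithm of the paper, and `contiguous_partition`, `costs` and `nprocs` are hypothetical names. In a dynamic setting, `costs` could be the solver times measured per cell in the previous time step.

```python
from bisect import bisect_left
from itertools import accumulate


def contiguous_partition(costs, nprocs):
    """Split cells 0..len(costs)-1 into nprocs contiguous ranges.

    Returns nprocs + 1 boundaries b; process r owns cells
    b[r] .. b[r + 1] - 1. Costs are assumed to be non-negative.
    """
    prefix = list(accumulate(costs))  # prefix[i] = cost of cells 0..i
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for r in range(1, nprocs):
        target = total * r / nprocs
        # Cut just after the first cell whose prefix cost reaches the target.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries non-decreasing and within the grid.
        bounds.append(max(bounds[-1], min(cut, len(costs))))
    bounds.append(len(costs))
    return bounds
```

With uniform costs the ranges have equal length; with skewed costs the expensive cells end up in shorter ranges.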
A Failure Handling Framework for Distributed Data Mining Services on the Grid
Eugenio Cesario, D. Talia
{"title":"A Failure Handling Framework for Distributed Data Mining Services on the Grid","authors":"Eugenio Cesario, D. Talia","doi":"10.1109/PDP.2011.50","DOIUrl":"https://doi.org/10.1109/PDP.2011.50","url":null,"abstract":"Fault tolerance is an important issue in Grid computing, where many heterogeneous machines are used. In this paper we present a flexible failure handling framework which extends a previously proposed service-oriented architecture for Distributed Data Mining, addressing the requirements for fault tolerance in the Grid. The framework allows users to achieve failure recovery whenever a crash occurs on a Grid node involved in the computation. The implemented framework has been evaluated on a real Grid setting to assess its effectiveness and performance.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126277402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Modeling Unconnectable Peers in Private BitTorrent Communities
Kornel Csernai, Márk Jelasity, J. Pouwelse, T. Vinkó
{"title":"Modeling Unconnectable Peers in Private BitTorrent Communities","authors":"Kornel Csernai, Márk Jelasity, J. Pouwelse, T. Vinkó","doi":"10.1109/PDP.2011.21","DOIUrl":"https://doi.org/10.1109/PDP.2011.21","url":null,"abstract":"In a typical BitTorrent swarm, a large proportion of the peers are behind firewalls or NATs. These peers are called unconnectable. When developing P2P applications, a main requirement is to handle unconnectable peers appropriately. One important aspect of this problem, which has not been emphasized so far, is understanding the difference between the attributes of unconnectable peers and peers in the open Internet. For example, if unconnectable peers spend much less time online, or if they download significantly more, exploiting these facts helps to optimize the implementation, and ignoring these facts can even lead to severe performance problems. Comparing open and unconnectable peers is not easy because most traces contain no information about connectability. Here we study two large traces collected in two private BitTorrent communities: FileList.org and BitSoup.org, both of which contain the connectability attribute. From these traces we extract several attributes of individual online sessions, swarms, and users. We compare the distributions of these attributes over unconnectable and open peers. We find that there are some potentially important differences, e.g., unconnectable users tend to have a lot more sessions, and they tend to spend slightly more time online. Some of our findings are in contradiction with previous results that were based on a different trace collection methodology.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126536469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
MPI-PERF-SIM: Towards an Automatic Performance Prediction Tool of MPI Programs on Hierarchical Clusters
Sami Achour, Meher Ammar, Boubaker Khmili, W. Nasri
{"title":"MPI-PERF-SIM: Towards an Automatic Performance Prediction Tool of MPI Programs on Hierarchical Clusters","authors":"Sami Achour, Meher Ammar, Boubaker Khmili, W. Nasri","doi":"10.1109/PDP.2011.49","DOIUrl":"https://doi.org/10.1109/PDP.2011.49","url":null,"abstract":"We present in this paper a framework for performance prediction of parallel programs on hierarchical clusters. This framework is mainly designed for use by the switching functions in parallel adaptive applications. The principal objectives of this framework are the accuracy of the prediction and the speed of the prediction process. To achieve these objectives, our framework is based on two principal steps: the first takes place when the parallel application is installed, and the second at runtime. In the first step, we profile two components, namely the sequential kernels of the program and the network performance. In order to model these two components accurately, we have developed new regression strategies. In the second step, we use the generated models and the runtime variables to estimate the completion time via our fast simulator MPI-PERF-SIM. Our experiments on the Grid'5000 platform demonstrate the interest of this approach, which can be the basis of adaptivity for parallel numerical libraries on dedicated hierarchical platforms.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131837739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
SSE Vectorized and GPU Implementations of Arakawa's Formula for Numerical Integration of Equations of Fluid Motion
Evren Yurtesen, M. Ropo, M. Aspnäs, J. Westerholm
{"title":"SSE Vectorized and GPU Implementations of Arakawa's Formula for Numerical Integration of Equations of Fluid Motion","authors":"Evren Yurtesen, M. Ropo, M. Aspnäs, J. Westerholm","doi":"10.1109/PDP.2011.80","DOIUrl":"https://doi.org/10.1109/PDP.2011.80","url":null,"abstract":"The numerical method presented by Arakawa in 1966 [3] implements a finite difference scheme of the Jacobian for the solution of the equation of motion for two-dimensional incompressible flows, which diminishes nonlinear computational instability and permits long-term numerical integrations. This paper presents an efficient implementation of Arakawa's formula using vectorized Streaming SIMD Extension (SSE) and Advanced Vector Extension (AVX) instructions. Additionally, we have improved the performance of memory access in the code. Performance measurements show that the vectorized implementation is close to two times more efficient compared to an implementation without SSE. The AVX version will in the near future further improve the vectorized performance with an estimated factor of up to 1.8. Finally we compare our results to an implementation on a general purpose graphics processor (GPGPU) and to auto-vectorization by two compilers.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"7 11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131072576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Adaptive and Cost-Optimal Parallel Algorithm for the 0-1 Knapsack Problem
Kenli Li, Lingxiao Li, Teklay Tesfazghi, E. Sha
{"title":"Adaptive and Cost-Optimal Parallel Algorithm for the 0-1 Knapsack Problem","authors":"Kenli Li, Lingxiao Li, Teklay Tesfazghi, E. Sha","doi":"10.1109/PDP.2011.11","DOIUrl":"https://doi.org/10.1109/PDP.2011.11","url":null,"abstract":"The 0-1 knapsack problem is well known to be an NP-complete problem. In the past two decades, much effort has been made to find techniques that could lead to algorithms with a reasonable running time. This paper proposes a new parallel algorithm for the 0-1 knapsack problem where the optimal merging algorithm is adopted. Based on an EREW PRAM machine with shared memory, the proposed algorithm utilizes O((2^(n/4))^(1-ε)) processors, 0 ≤ ε ≤ 1, and O(2^(n/2)) memory to find a solution for the n-element 0-1 knapsack problem in time O((2^(n/4))(2^(n/4))^ε). Thus the cost of the proposed parallel algorithm is O(2^(n/2)), which is both the lowest upper-bound time and without memory conflicts if only quantity of objects is considered in the complexity analysis for the 0-1 knapsack problem. Thus it is an improvement over past results.","PeriodicalId":341803,"journal":{"name":"2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127403403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
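The cost figure quoted in this abstract follows from multiplying the stated processor count by the stated running time. The short derivation below only restates the bounds given in the abstract; the symbols P(n), T(n) and C(n) are notation introduced here for illustration.

```latex
\begin{align*}
P(n) &= O\!\left(\left(2^{n/4}\right)^{1-\varepsilon}\right), \qquad
T(n) = O\!\left(2^{n/4}\left(2^{n/4}\right)^{\varepsilon}\right), \qquad
0 \le \varepsilon \le 1, \\
C(n) &= P(n)\,T(n)
      = O\!\left(\left(2^{n/4}\right)^{1-\varepsilon}\cdot 2^{n/4}\cdot\left(2^{n/4}\right)^{\varepsilon}\right)
      = O\!\left(2^{n/4}\cdot 2^{n/4}\right)
      = O\!\left(2^{n/2}\right).
\end{align*}
```

The parameter ε trades processors for time at constant cost: ε = 0 gives O(2^(n/4)) processors and O(2^(n/4)) time, while ε = 1 gives a single processor and O(2^(n/2)) time.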