2018 International Conference on High Performance Computing & Simulation (HPCS)最新文献_第9页

GPU-Accelerated VoltDB: A Case for Indexed Nested Loop Join gpu加速的voldb:索引嵌套循环连接的一个案例

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00046

A. Nguyen, M. Edahiro, S. Kato

{"title":"GPU-Accelerated VoltDB: A Case for Indexed Nested Loop Join","authors":"A. Nguyen, M. Edahiro, S. Kato","doi":"10.1109/HPCS.2018.00046","DOIUrl":"https://doi.org/10.1109/HPCS.2018.00046","url":null,"abstract":"Graphics Processing Units (GPUs) are traditionally designed for gaming purposes. The new GPU hardware and new programming platforms for GPU applications have enabled GPUs to work as co-processors alongside Central Processing Units (CPUs) in order to speed up general purpose applications. In this paper, we focus on the design and implementation of the GPU-Accelerated indexed nested loop join (INLJ) for in-memory relational database management system (RDBMS). Previous studies have proposed novel approaches for using GPU to improve the performance of the relational INLJ, but they are only implemented on simulation systems. Their performance in current industry RDBMS still needs to be clarified. To this end, we implement the GPU-Accelerated INLJ algorithm and perform various experiments on that join in VoltDB, an inmemory commercial RDBMS. We also propose a method for handling skewed input data, which is a critical problem in the GPU INLJ. Our evaluations indicated that though the GPU-Accelerated INLJ is 2-14X faster than the default INLJ of VoltDB, the memory copy between the host and the GPU memory is the major factor that holds back the join's speedup rate.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115562307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Enhancing Loosely Schema-aware Entity Resolution with User Interaction 通过用户交互增强松散模式感知实体解析

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00138

Giovanni Simonini, Luca Gagliardelli, Song Zhu, S. Bergamaschi

{"title":"Enhancing Loosely Schema-aware Entity Resolution with User Interaction","authors":"Giovanni Simonini, Luca Gagliardelli, Song Zhu, S. Bergamaschi","doi":"10.1109/HPCS.2018.00138","DOIUrl":"https://doi.org/10.1109/HPCS.2018.00138","url":null,"abstract":"Entity Resolution (ER) is a fundamental task of data integration: it identifies different representations (i.e., profiles) of the same real-world entity in databases. To compare all possible profile pairs through an ER algorithm has a quadratic complexity. Blocking is commonly employed to avoid that: profiles are grouped into blocks according to some features, and ER is performed only for entities of the same block. Yet, devising blocking criteria and ER algorithms for data with highly schema heterogeneity is a difficult and error-prone task calling for automatic methods and debugging tools. In our previous work, we presented Blast, an ER system that can scale practitioners' favorite Entity Resolution algorithms. In current version, Blast has been devised to take full advantage of parallel and distributed computation as well (running on top of Apache Spark). It implements the state-of-the-art unsupervised blocking method based on automatically extracted loose schema information. We build on top of blast a GUI (Graphic User Interface), which allows: (i) to visualize, understand, and (optionally) manually modify the loose schema information automatically extracted (i.e., injecting user's knowledge in the system); (ii) to retrieve resolved entities through a free-text search box, and to visualize the process that lead to that result (i.e., the provenance). Experimental results on real-world datasets show that these two functionalities can significantly enhance Entity Resolution results.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123610530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Parallel Algorithms for Multidimensional Data Streams Analysis with Tensor Subspace Models 基于张量子空间模型的多维数据流分析并行算法

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00139

B. Cyganek

引用次数: 0

From Global Choreography to Efficient Distributed Implementation 从全局编排到高效的分布式实现

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00122

Rayan Hallal, Mohamad Jaber, Rasha Abdallah

引用次数: 4

Oil Thickness Estimation Using Single- and Dual- Frequency Maximum-Likelihood Approach 单频和双频最大似然法估计油层厚度

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00025

Bilal Hammoud, H. Ayad, M. Fadlallah, J. Jomaah, F. Ndagijimana, G. Faour

引用次数: 5

Parallel Simulation of Electrophoretic Deposition for Industrial Automotive Applications 工业汽车用电泳沉积的并行模拟

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00080

Kevin Verma, Luis Ayuso, R. Wille

{"title":"Parallel Simulation of Electrophoretic Deposition for Industrial Automotive Applications","authors":"Kevin Verma, Luis Ayuso, R. Wille","doi":"10.1109/HPCS.2018.00080","DOIUrl":"https://doi.org/10.1109/HPCS.2018.00080","url":null,"abstract":"Electrophoretic Deposition (EPD) coating is one of the key applications in automotive manufacturing. In the recent years, tools based on Computational Fluid Dynamics (CFD) have been utilized to simulate corresponding coating processes. However, the complex data used in this application frequently brings standard CFD applications to its limits. For that purpose, a CFD-based tool named ALSIM has been proposed, which employs a unique volumetric decomposition method that addresses these problems. However, certain characteristics of this methodology yield drawbacks for the typical process used in this application - resulting in large execution times. In this work, we present a parallel scheme for this application which addresses these short-comings. To this end, two layers of parallelism are introduced. Both are implemented by employing OpenMP, allowing for the execution on shared memory parallel architectures. Experimental evaluations confirm the scalability and efficiency of the proposed methods. The simulation time of a typical use case in the automotive industry could be reduced from almost 6 days to 13 hours when employing 16 processing cores.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128467538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

How Improve Set Similarity Join Based on Prefix Approach in Distributed Environment 如何改进分布式环境下基于前缀的集相似度连接

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00136

Song Zhu, Luca Gagliardelli, Giovanni Simonini, D. Beneventano

{"title":"How Improve Set Similarity Join Based on Prefix Approach in Distributed Environment","authors":"Song Zhu, Luca Gagliardelli, Giovanni Simonini, D. Beneventano","doi":"10.1109/HPCS.2018.00136","DOIUrl":"https://doi.org/10.1109/HPCS.2018.00136","url":null,"abstract":"Set similarity join is an essential operation to find similar pairs of records in data integration and data analytics applications. To cope with the increasing scale of the data, several techniques have been proposed to perform set similarity join using distributed frameworks (e.g. MapReduce). In particular, it is publicly available a MapReduce implementation of the PPJoin, that was experimentally demonstrated as one of the best set similarity join algorithm. However, these techniques produce huge amounts of duplicates in order to perform a successful parallel processing. Moreover, these approaches do not guarantee the load balancing, which generates skewness problem and less scalability of these techniques. To address these problems, we propose a duplicate-free technique called TTJoin, that performs set similarity join efficiently by utilizing an innovative filter derived from the prefix filter. Moreover, we implemented TTJoin on Apache Spark, that is one of the most innovative distributed framework. Several experiments on real-world datasets demonstrate the effectiveness of proposed solution with respect to either traditional TTJoin MapReduce implementation.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127017402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Experiments in Routing Vehicles for Municipal Services 市政服务车辆路线试验

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00156

Imran Mahmood, J. Zubairi, Sahar Idwan, Izzeddin Matar

引用次数: 3

Insights into Application-level Solutions towards Resilient MPI Applications 针对弹性MPI应用的应用级解决方案的见解

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00101

P. González, Nuria Losada, María J. Martín

引用次数: 0

Assessing the Use of Genetic Algorithms to Schedule Independent Tasks Under Power Constraints 评估在功率限制下使用遗传算法调度独立任务

2018 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2018-07-01 DOI: 10.1109/HPCS.2018.00052

A. Kassab, J. Nicod, L. Philippe, V. Rehn-Sonigo

{"title":"Assessing the Use of Genetic Algorithms to Schedule Independent Tasks Under Power Constraints","authors":"A. Kassab, J. Nicod, L. Philippe, V. Rehn-Sonigo","doi":"10.1109/HPCS.2018.00052","DOIUrl":"https://doi.org/10.1109/HPCS.2018.00052","url":null,"abstract":"Green data and computing centers, centers using renewable energy sources, can be a valid solution to the over growing energy consumption of data or computing centers and their corresponding carbon foot print. Powering these centers with energy solely provided by renewable energy sources is however a challenge because renewable sources (like solar panels and wind turbines) cannot guarantee a continuous feeding due to their intermittent energy production. The high computation demand of HPC applications requires high power levels to be provided from the power supply. On the other hand, one advantage is that unlike online applications, HPC applications can tolerate delaying the execution of some tasks. Since the users however want their results as early as possible, minimum makespan is usually the main objective when scheduling this kind of jobs. The optimization problem of scheduling a set of tasks under power constraints is however proven to be NP- Complete. Designing and assessing heuristics is hence the only way to propose efficient solutions. In this paper, we present genetic algorithms for scheduling sets of independent tasks in parallel, with the objective of minimizing the makespan under power availability constraints. Extensive simulations show that genetic algorithms can compute good schedules for this problem.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131953851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6