2017 International Conference on High Performance Computing & Simulation (HPCS): Latest Papers, Page 3

High Performance Analysis of Omics Data: Experiences at University Magna Graecia of Catanzaro

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.157

Giuseppe Agapito, P. Guzzi, M. Cannataro

Citations: 0

Picos, A Hardware Task-Dependence Manager for Task-Based Dataflow Programming Models

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.134

Xubin Tan, Jaume Bosch, Miquel Vidal Piñol, C. Álvarez, Daniel Jiménez-González, E. Ayguadé, M. Valero

Abstract: Task-based programming models such as OpenMP, Intel TBB and OmpSs are widely used to extract high levels of parallelism from applications executed on multi-core and manycore platforms. These programming models allow applications to be expressed as a set of tasks with dependences that drive their execution at runtime. While managing these dependences for coarse-grained tasks proves highly beneficial, it introduces noticeable overheads for fine-grained tasks, diminishing the potential speedups or even causing performance losses. To overcome this drawback, we propose Picos, a hardware/software co-design that manages inter-task dependences efficiently. In this paper we describe the main ideas of our proposal and a prototype implementation. This prototype is integrated with a parallel task-based programming model and evaluated with real executions on a Linux embedded system with two ARM Cortex-A9 cores and an FPGA. Compared with a software runtime, our solution achieves more than 1.8x speedup and 40% energy savings with only 2 threads.
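To make the idea of inter-task dependence management concrete, the following is a minimal software sketch of the bookkeeping that a task-based runtime performs. It is not the Picos hardware design, and the `TaskGraph` class, its method names, and the example tasks are all hypothetical. It assumes every task declares the addresses it reads and writes, and that all tasks are submitted before any of them runs.

```python
from collections import defaultdict, deque


class TaskGraph:
    """Toy software tracker for inter-task data dependences (hypothetical API)."""

    def __init__(self):
        self.preds = {}                    # task -> number of predecessors
        self.succs = defaultdict(set)      # task -> tasks that must wait for it
        self.last_writer = {}              # address -> last task that writes it
        self.readers = defaultdict(list)   # address -> readers since the last write

    def submit(self, task, inputs=(), outputs=()):
        deps = set()
        for addr in inputs:                # read-after-write
            if addr in self.last_writer:
                deps.add(self.last_writer[addr])
        for addr in outputs:               # write-after-write and write-after-read
            if addr in self.last_writer:
                deps.add(self.last_writer[addr])
            deps.update(self.readers[addr])
        deps.discard(task)
        for addr in inputs:
            self.readers[addr].append(task)
        for addr in outputs:
            self.last_writer[addr] = task
            self.readers[addr] = []
        self.preds[task] = len(deps)
        for dep in deps:
            self.succs[dep].add(task)

    def run_order(self):
        """Return one valid execution order, releasing tasks as predecessors finish."""
        remaining = dict(self.preds)
        ready = deque(t for t, n in remaining.items() if n == 0)
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for succ in sorted(self.succs[task]):
                remaining[succ] -= 1
                if remaining[succ] == 0:
                    ready.append(succ)
        return order


g = TaskGraph()
g.submit("A", outputs=["x"])
g.submit("B", inputs=["x"], outputs=["y"])
g.submit("C", inputs=["x"], outputs=["z"])
g.submit("D", inputs=["y", "z"], outputs=["x"])
order = g.run_order()   # ['A', 'B', 'C', 'D']
```

Every `submit` call touches several tables, which is the kind of per-task cost that becomes noticeable when tasks are fine-grained.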

Citations: 1

The Parallel and Distributed Future of Data Series Mining

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.155

Themis Palpanas

Abstract: There is an increasingly pressing need, across several applications in diverse domains, for techniques able to index and mine very large collections of sequences, or data series. Examples of such applications come from biology, astronomy, entomology, the web, and other domains. It is not unusual for these applications to involve hundreds of millions to billions of data series, which are often not analyzed in full detail because of their sheer size. In this work, we describe past efforts in designing techniques for indexing and mining truly massive collections of data series, based on indexing techniques for fast similarity search, an operation that lies at the core of many mining algorithms. We show that there are two bottlenecks in mining such massive datasets: the time taken to build the index, and the time required to answer similarity queries exactly. In response to these challenges, we discuss novel techniques that adaptively create data series indexes, allowing users to correctly answer queries before the indexing task is finished. We also show how our methods allow mining on datasets that would otherwise be completely untenable, including the first published experiments using one billion data series. Moreover, we present our vision for the future of big sequence management and mining research: we argue that more effort should concentrate on parallel (including modern hardware optimization opportunities) and distributed solutions, which have until now been largely unexploited.
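For readers unfamiliar with similarity search, the sketch below shows the unindexed baseline: an exact nearest-neighbor query answered by a linear scan. It is not the indexing technique described in the abstract. The function name is hypothetical, and it assumes Euclidean distance and series of equal length.

```python
import math


def nearest_series(query, collection):
    """Exact 1-NN by Euclidean distance: linear scan with early abandoning."""
    best_idx, best_sq = -1, math.inf
    for idx, series in enumerate(collection):
        sq = 0.0
        for q, s in zip(query, series):
            sq += (q - s) ** 2
            if sq >= best_sq:      # cannot beat the current best, so stop early
                break
        else:
            # The inner loop finished without abandoning: this is a new best match.
            best_idx, best_sq = idx, sq
    return best_idx, math.sqrt(best_sq)
```

The scan visits every series for every query, so its cost grows linearly with the collection size. That is why collections with billions of series need an index, and why the time to build that index becomes a bottleneck of its own.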

Citations: 18

Using Virtualisation for Reproducible Research and Code Portability

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.139

Svetlana Sveshnikova, I. Gankevich

Citations: 1

Modeling the Internet of Things: a simulation perspective

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.13

Gabriele D’Angelo, S. Ferretti, V. Ghini

Citations: 22

Effect of Different Varactor Models on Antenna Tunability

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.50

M. Madi, K. Kabalan, M. Al‐Husseini

Citations: 3

Learning Word Embeddings in Parallel by Alignment

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.90

Sahil Zubair, M. Zubair

Abstract: Distributed representations have become the de facto standard by which many modern neural network architectures deal with natural language processing tasks. In particular, the word2vec algorithm introduced by Mikolov et al. popularized the use of distributed representations by demonstrating that learned embeddings capture semantic relationships geometrically. Though word2vec addresses some of the scaling issues of earlier approaches, it can still take days to complete the training process for very large data sets. Recently, researchers have tried to address this by proposing parallel variants of the word2vec algorithm. In these approaches, the data set is partitioned among multiple processors that asynchronously update a shared model. We propose a parallel approach for word2vec that is based on instantiating multiple models, each working with its own data set. Our scheme transfers the learning between different models at discrete intervals (synchronously). The frequency with which we transfer the learning between models is much lower than the frequency of asynchronous updates in existing approaches. In our approach, we treat each instantiated word2vec instance as an independent model. This implies that off-the-shelf implementations of word2vec can be used in our parallel approach. The key feature of our algorithm is how we transfer the parameters between different models that have been independently trained on distinct partitions of a large data set. For this we propose a computationally inexpensive alignment and merge step. We validate our algorithm on a publicly available dataset using an implementation of word2vec in Google's TensorFlow software. We evaluate our algorithm by comparing its runtime with the runtime of the sequential algorithm for a given training loss. Our results show that our parallel algorithm is able to achieve an efficiency of up to 57%.
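The efficiency figure can be read against the standard definition of parallel efficiency: speedup divided by the number of workers. The sketch below assumes the paper uses that definition, which the abstract does not spell out. The function name and the timings in the example are hypothetical and are not results from the paper.

```python
def parallel_efficiency(t_sequential, t_parallel, workers):
    """Parallel efficiency = (sequential time / parallel time) / number of workers."""
    if t_parallel <= 0 or workers <= 0:
        raise ValueError("t_parallel and workers must be positive")
    speedup = t_sequential / t_parallel
    return speedup / workers


# Hypothetical timings: 120 s sequentially, 50 s on 4 workers to reach the same loss.
efficiency = parallel_efficiency(120.0, 50.0, 4)   # speedup 2.4 over 4 workers
```

An efficiency of 1.0 would mean perfectly linear scaling, so values below 1.0 measure how much time is lost to synchronization and merging.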

Citations: 0

Chunk-Wise Parallelization Based on Dynamic Performance Prediction on Heterogeneous Multicores

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.28

A. Dab, Y. Slama

Abstract: Multicore machines are becoming more and more common. Ideally, all applications benefit from these advances in computer architecture. A complex challenge in parallel computing is balancing load across cores to minimize the overall execution time of the parallel program, called the makespan. As multicores may have different architectures, an effective mapping should accommodate this unknown variation to avoid degrading the makespan. In fact, a mapping or static load-balancing method may not be effective when the state of the target machine changes during program execution. Thread affinity has emerged as an important technique for improving program performance and performance stability. In this context, we propose a predictive approach that chunks iterations at runtime, allowing parallel code to adapt to processor performance. Our approach is based on thread pinning and performance detection at execution time. From a parallel program, we define a set of loop nest iterations, forming what is called a chunk, and we run it using a first mapping that assumes homogeneous cores. Then, a performance assessment corrects the mapping by speculating on the future state of the cores. The new mapping is then applied to a new chunk for further evaluation and prediction. The process stops when the program is fully executed or when chunking is judged to be no longer effective.
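The chunk, measure, and remap loop described in the abstract can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, the per-core speeds in `true_speeds` stand in for real timing on pinned threads, and the predictor simply reuses the throughput observed in the previous chunk.

```python
def split_chunk(chunk_size, speeds):
    """Divide chunk_size iterations among cores in proportion to predicted speeds."""
    total = sum(speeds)
    shares = [int(chunk_size * s / total) for s in speeds]
    # Hand out iterations lost to rounding, fastest cores first.
    leftover = chunk_size - sum(shares)
    fastest_first = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


def run_chunked(total_iters, chunk_size, true_speeds):
    """Simulate chunk-wise execution; true_speeds are iterations per second per core."""
    predicted = [1.0] * len(true_speeds)   # first mapping assumes homogeneous cores
    done, makespan, history = 0, 0.0, []
    while done < total_iters:
        size = min(chunk_size, total_iters - done)
        shares = split_chunk(size, predicted)
        times = [n / v for n, v in zip(shares, true_speeds)]   # "measured" core times
        makespan += max(times)             # a chunk ends when its slowest core finishes
        # Predict each core's next speed from the throughput observed in this chunk.
        predicted = [n / t if t > 0 else p
                     for n, t, p in zip(shares, times, predicted)]
        history.append(shares)
        done += size
    return makespan, history


makespan, history = run_chunked(120, 60, [2.0, 1.0])
# history == [[30, 30], [40, 20]]; makespan == 50.0 simulated seconds
```

In this simulation a static even split of all 120 iterations would take 60 simulated seconds, because the slower core would need 60 s for its half. The corrected second chunk reduces the total to 50.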

Citations: 2

Towards Efficient Algorithms for Compressed Sparse-Sparse Matrix Product

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.101

S. Ezouaoui, O. Hamdi-Larbi, Z. Mahjoub

Citations: 3

Automatic Generation of Wireless Sensor Networks Scheduling

2017 International Conference on High Performance Computing & Simulation (HPCS) Pub Date : 2017-07-01 DOI: 10.1109/HPCS.2017.32

Anis Mezni, E. Dumitrescu, É. Niel, S. Ahmed

Citations: 1