{"title":"Assessing Big Data SQL Frameworks for Analyzing Event Logs","authors":"Markku Hinkka, Teemu Lehto, Keijo Heljanko","doi":"10.1109/PDP.2016.26","DOIUrl":"https://doi.org/10.1109/PDP.2016.26","url":null,"abstract":"Performing Process Mining by analyzing event logs generated by various systems is a very computation and I/O intensive task. Distributed computing and Big Data processing frameworks make it possible to distribute all kinds of computation tasks to multiple computers instead of performing the whole task in a single computer. This paper assesses whether contemporary structured query language (SQL) supporting Big Data processing frameworks are mature enough to be efficiently used to distribute computation of two central Process Mining tasks to two dissimilar clusters of computers providing BPM as a service in the cloud. Tests are performed by using a novel automatic testing framework detailed in this paper and its supporting materials. As a result, an assessment is made on how well selected Big Data processing frameworks manage to process and to parallelize the analysis work required by Process Mining tasks.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130488550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"X-Ray Computed Tomography Applied to Objects of Cultural Heritage: Porting and Testing the Filtered Back-Projection Reconstruction Algorithm on Low Power Systems-on-Chip","authors":"Elena Corni, L. Morganti, M. Morigi, R. Brancaccio, M. Bettuzzi, G. Levi, E. Peccenini, D. Cesini, A. Ferraro","doi":"10.1109/PDP.2016.60","DOIUrl":"https://doi.org/10.1109/PDP.2016.60","url":null,"abstract":"The embedded and high-performance computing (HPC) sectors, that in the past were completely separated, are now somehow converging under the pressure of two driving forces: the release of less power consuming server processors and the increased performance of the new low power Systems-on-Chip (SoCs) developed to meet the requirements of the demanding mobile market. This convergence allows the porting to low power embedded architectures of applications that were originally confined to traditional HPC systems. In this paper, we present our experience of porting the Filtered Back-projection Algorithm to a low power, low cost system-on-chip, the NVIDIA Tegra K1, which is based on a quad core ARM CPU and on a NVIDIA Kepler GPU. This Filtered Back-projection Algorithm is heavily used in 3D Tomography reconstruction software. The porting has been done exploiting various programming languages (i.e. OpenMP, CUDA) and multiple versions of the application have been developed to exploit both the SoC CPU and GPU. The performances have been measured in terms of 2D slices (of a 3D volume) reconstructed per time unit and per energy unit. The results obtained with all the developed versions are reported and compared with those obtained on a typical x86 HPC node accelerated with a recent NVIDIA GPU. The best performances are achieved combining the OpenMP version and the CUDA version of the algorithm. In particular, we discovered that only three Jetson TK1 boards, equipped with Giga Ethernet interconnections, allow to reconstruct as many images per time unit as a traditional server, using one order of magnitude less energy. The results of this work can be applied for instance to the construction of an energy-efficient computing system of a portable tomographic apparatus.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124337584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Parallel Implementations of the Bayesian Probabilistic Matrix Factorization","authors":"Imen Chakroun, Tom Haber, T. Aa, Thomas Kovac","doi":"10.1109/PDP.2016.48","DOIUrl":"https://doi.org/10.1109/PDP.2016.48","url":null,"abstract":"Using the matrix factorization technique in machine learning is very common mainly in areas like recommender systems. Despite its high prediction accuracy and its ability to avoid over-fitting of the data, the Bayesian Probabilistic Matrix Factorization algorithm (BPMF) has not been widely used because of the prohibitive cost. In this paper, we propose a comprehensive parallel implementation of the BPMF using Gibbs sampling on shared and distributed architectures. We also propose an insight of a GPU-based implementation of this algorithm.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122887072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stochastic Thermal Control of a Multicore Real-Time System","authors":"M. Mohaqeqi, M. Kargahi, K. Fouladi","doi":"10.1109/PDP.2016.44","DOIUrl":"https://doi.org/10.1109/PDP.2016.44","url":null,"abstract":"This paper deals with thermal management of a multicore processor executing multiple stochastic real-time job streams. The main objective is to reduce the chip-wide temperature gradient to decelerate processor aging, and the subordinate goal is to decrease the hotspot temperature. A pair of active and passive cores is dedicated to each stream, which the active one services the corresponding real-time jobs. In order to reduce the chip-wide temperature gradient between cores, the active and passive cores of an individual stream are replaced at appropriate times through job migration. The thermal management of this system is a specific stochastic control problem. Regarding the inter-effects of core temperatures and the stochastic nature of the system, systematic achievement of the objective needs an appropriate method. The control theory of Markov jump linear system (MJLS) has been used to design the desired thermal controller and analytically study its stability. The efficacy of the proposed approach in terms of the thermal management objectives is investigated through simulation experiments.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131452178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Dynamic Fault Tree Analysis Based on Stochastic Logic Utilizing GPGPUs","authors":"Elham Cheshmikhani, H. Zarandi","doi":"10.1109/PDP.2016.130","DOIUrl":"https://doi.org/10.1109/PDP.2016.130","url":null,"abstract":"This paper demonstrates on speeding up an accurate analysis of fault trees using stochastic logic through GPGPUs. Actually, probability models of dynamic gates and new accurate models for different combinations of cold spare gate e.g., two cold spare gates with a share spare and a cold spare gate with more than one spare inputs are developed in this paper. Experimental results show that on average, the proposed analysis method is 235 times faster than CPU simulation time. Moreover, proposing new stochastic models results accuracy and simplicity as additional advantages of the proposed method.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114654976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of Memory-Level Parallelism on the Performance of GPU Coherence Protocols","authors":"F. Candel, S. Petit, J. Sahuquillo, J. Duato","doi":"10.1109/PDP.2016.67","DOIUrl":"https://doi.org/10.1109/PDP.2016.67","url":null,"abstract":"Graphics Processing Units (GPUs) are being implemented in heterogeneous CPU/GPU systems due their high efficiency when executing massively parallel applications. New challenges appear to deal with heterogenous coherence in these systems due to the huge amount (hundreds or thousands) of on-going memory requests of GPUs, which is limited by the Miss Status Holding Register (MSHR) file size associated to the L1 cache. This paper analyzes how the number of MSHRs i) affects to typical memory performance metrics and ii) impacts on the system performance under two recent GPU coherence protocols, called NMOESI and SI (Southern Islands), which introduce distinct coherence traffic. We find two key findings that can help improve the performance of coherence protocols. First, there is a strong correlation between system performance and memory subsystem latency regardless of the used protocol. Second, system performance varies with the number of supported cache misses, however, counterintuitively, supporting more cache misses does not always bring enhanced performance but it can turn into performance drops.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130169776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conch: A Cyclic MapReduce Model for Iterative Applications","authors":"Ran Zheng, Genmao Yu, Hai Jin, Xuanhua Shi, Qin Zhang","doi":"10.1109/PDP.2016.66","DOIUrl":"https://doi.org/10.1109/PDP.2016.66","url":null,"abstract":"MapReduce programming model is a popular model to simplify but speed up data parallel applications. However, it is not efficient for iterative applications because of its repeated data transmission with HDFS (Hadoop Distributed File System). Conch, a cyclic MapReduce model, is designed for efficient processing of iterative applications. In order to minimize network overhead, shared data is cached locally and a \"map-shuffle\" phase is presented with a combined transmission mechanism. Meanwhile, a prediction scheduler for iterative applications is brought out to achieve better data locality in terms of runtime information. The experiments show that Conch can support iterative applications transparently and efficiently. Compared with Hadoop and HaLoop in single-job environment, Conch can achieve 13%-17% improvements on K-Means and fuzzy C-Means. Especially in multi-job environment, 63.6% and 28.6% improvements can be obtained compared with Hadoop and HaLoop.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121378285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RGBCC: A New Congestion Control Mechanism for InfiniBand","authors":"Qian Liu, R. Russell","doi":"10.1109/PDP.2016.87","DOIUrl":"https://doi.org/10.1109/PDP.2016.87","url":null,"abstract":"The InfiniBand Congestion Control (IB CC) mechanism is able to reduce the negative congestion consequences in many situations. However, its effectiveness depends on a set of configurable parameters that must be adjusted by users to tackle congestion. If the parameters are not appropriately configured, IB CC could negatively impact the network performance and production jobs. Additionally, the IB CC mechanism is very sensitive to slight changes in its parameter settings and in traffic patterns. These difficulties in adjusting parameters prevent IB CC from being widely used today. In this paper we propose a new congestion control mechanism called Red and Green lights-Based Congestion Control (RGBCC). Simulation results have demonstrated that RGBCC is able to reduce the congestion consequences and can be dynamically adapted to various network topologies and traffic patterns without user intervention or parameter re-configuration. Furthermore, because RGBCC follows mostly the logic of the current IB CC mechanism, only minimal changes would be needed to the existing hardware/firmware.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127209187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DKPN: A Composite Dataflow/Kahn Process Networks Execution Model","authors":"P. Arras, D. Fuin, E. Jeannot, Samuel Thibault","doi":"10.1109/PDP.2016.34","DOIUrl":"https://doi.org/10.1109/PDP.2016.34","url":null,"abstract":"To address the high level of dynamism and variability in modern streaming applications (e.g. video decoding) as well as the difficulties in programming heterogeneous MPSoCs, we propose a novel execution model based upon both dataflow and Kahn process networks. This paper presents the semantics and properties of this hierarchical and parametric model, called DKPN. Parameters are classified and it is shown that hints can be derived to improve the execution. A scheduler framework and policies to back the model are also exposed. Experiments illustrate the benefits of our approach.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114360737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulating Search Protocols in Large-Scale Dynamic Networks","authors":"S. Margariti, V. Dimakopoulos","doi":"10.1109/PDP.2016.74","DOIUrl":"https://doi.org/10.1109/PDP.2016.74","url":null,"abstract":"Reproducing complex networks with features of real-life networks is exciting and challenging at the same time. Based on the popular Omnet++ discrete event simulator, we introduce Armonia, a framework for modeling massive networks and their dynamic interactions. It includes a collection of topology generators, a set of resource placement and replication modules, a component for specifying resource location strategies, while also offering support for exporting data in order to visualize or analyze with other appropriate tools. Our framework targets search protocols in large-scale dynamic networks. Here, we apply it to simulate various probabilistic flooding strategies, making a comparative study of their performance over different network topologies.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121697795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}