2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing: Latest Papers, Page 3

Design and Development of a Facebook Application to Raise Privacy Awareness

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.23

Gianpiero Costantino, D. Sgandurra

Abstract: Every day, people upload a large number of private pictures to online social networks (OSNs). Users trust OSNs to keep their pictures private, e.g. by making them available to their social friends only. Unfortunately, OSN security controls are not always strong enough, and malicious people may exploit these weaknesses to potentially see any user's private pictures. It might even be possible to access private photos posted on an OSN without circumventing its security policies. In fact, users sometimes add to their social circles acquaintances or recently met people who might not be completely trusted. Furthermore, they occasionally allow third-party applications to access their pictures. These conditions imply that, to keep their photos private, users must trust all the security controls implemented by OSNs and all of their social friends (and how they interact with third-party applications). In practice, there are situations in which these assumptions are not met and data that users believed to be private might also be accessed by unknown people. The goal of this paper is to raise awareness of the problem of privacy of online pictures and to have OSN users think more carefully about how they use third-party applications and how they choose their friends online. To this end, we discuss a use case of a Facebook application, which we have developed, that exploits some weaknesses and users' assumptions to gather a huge amount of private pictures.

Citations: 0

A Practical Performance Model for Compute and Memory Bound GPU Kernels

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.51

E. Konstantinidis, Y. Cotronis

Abstract: Performance prediction of GPU kernels is generally a tedious procedure with unpredictable results. In this paper, we provide a practical model for estimating the performance of CUDA kernels on GPU hardware in an automated manner. First, we propose the quadrant-split model, an alternative to the roofline visual performance model, which provides insight into the performance-limiting factors of multiple devices with different compute-memory bandwidth ratios with respect to a particular kernel. We elaborate on the compute-memory bound characteristic of kernels. In addition, a micro-benchmark program was developed that exposes the peak compute and memory transfer performance using variable operation intensity. Experimental results of executions on different GPUs are presented. In the proposed performance prediction procedure, a set of kernel features is extracted through an automated profiling execution which records a set of significant kernel metrics. Additionally, a small set of device features for the target GPU is generated using micro-benchmarking and architecture specifications. Combining kernel and device features, we determine the performance-limiting factor and generate an estimate of the kernel's execution time. We performed experiments on DAXPY, DGEMM, FFT and stencil computation kernels using 4 GPUs and showed an absolute error in predictions of 10.1% in the average case and 25.8% in the worst case.

Citations: 30
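
As a rough illustration of how kernel features and device features can be combined to pick a limiting factor, the sketch below applies a plain roofline-style bound: the estimated time is the larger of the compute time and the memory-transfer time. This is not the paper's quadrant-split model. The Device class, the function name, and the peak figures are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    peak_gflops: float  # peak compute throughput in GFLOP/s (hypothetical figure)
    peak_gbps: float    # peak memory bandwidth in GB/s (hypothetical figure)


def estimate_kernel_time(flops, bytes_moved, dev):
    """Return (seconds, limiting_factor) under a max(compute, memory) bound."""
    t_compute = flops / (dev.peak_gflops * 1e9)
    t_memory = bytes_moved / (dev.peak_gbps * 1e9)
    if t_compute >= t_memory:
        return t_compute, "compute"
    return t_memory, "memory"


# DAXPY (y = a*x + y) on n doubles: 2 flops per element and
# 3 * 8 bytes of traffic per element (read x, read y, write y).
n = 10_000_000
dev = Device(peak_gflops=1000.0, peak_gbps=200.0)
seconds, bound = estimate_kernel_time(2 * n, 24 * n, dev)
print(f"estimated {seconds * 1e3:.3f} ms, {bound}-bound")
```

With these assumed peaks, DAXPY comes out memory-bound, which matches its low ratio of arithmetic to data movement. A real prediction would replace the assumed peaks with measured values such as those the abstract obtains from micro-benchmarking.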

A Scheduling Strategy Based on Redundancy of Service Requests on IaaS Providers

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.80

Cristiano C. A. Vieira, L. Bittencourt, E. Madeira

Abstract: The improvement in virtualisation technologies, as well as higher data transmission rates and the popularisation of the Internet, has contributed to strengthening cloud computing. This paradigm attracts more and more users to consume, and providers to offer, services through the pay-as-you-go model. Users can benefit from clouds by receiving high availability, cost reduction, load balancing, and better fault tolerance when deploying cloud services across several cloud providers instead of using a single service provider. However, the variety of charging models, quality of service (QoS), and monetary costs of virtual machines makes it difficult for the customer to choose the best resource charging models to deploy applications. In this paper we propose a redundancy strategy based on the tenancy level among requests in order to decrease the scheduling cost and maintain QoS levels across different IaaS charging models. The strategy is modelled as an integer linear program (ILP) to compute the lowest cost when utilising virtual machines. Simulations show that the proposed approach computes schedules with smaller costs than alternative approaches.

Citations: 6

Application-Agnostic Framework for Improving the Energy Efficiency of Multiple HPC Subsystems

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.18

Ghislain Landry Tsafack Chetsa, L. Lefèvre, J. Pierson, P. Stolf, Georges Da Costa

Abstract: The subsystems that compose an HPC platform (e.g. CPU, memory, storage and network) are often designed and configured to deliver exceptional performance to a wide range of workloads. As a result, a large part of the power that these subsystems consume is dissipated as heat even when executing workloads that do not require maximum performance. Attempts to tackle this problem include technologies whereby operating systems and applications can reconfigure subsystems dynamically, such as by using DVFS for CPUs, LPI for network components, and variable disk spinning for HDDs. Most previous work has explored these technologies individually to optimise workload execution and reduce energy consumption. We propose a framework that performs on-line analysis of an HPC system in order to identify application execution patterns without a priori information about their workload. The framework takes advantage of recurring patterns to reconfigure multiple subsystems dynamically and reduce overall energy consumption. Performance evaluation was carried out on Grid'5000 considering both traditional HPC benchmarks and real-life applications.

Citations: 9

Efficient Lock-Free Work-Stealing Iterators for Data-Parallel Collections

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.65

Aleksandar Prokopec, Dmitry Petrashko, Martin Odersky

Citations: 18

On the Quality of Implementation of the C++11 Thread Support Library

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.33

Peter Thoman, P. Gschwandtner, T. Fahringer

Abstract: Providing standardized building blocks for task-parallel programs within a language and its standard library has several advantages over other solutions. Close integration with compilers and runtime systems allows for potentially higher performance, and portability facilitates widespread use. In the recently ratified C++11 standard, language constructs have been added along with a memory model to provide the developer with such building blocks. They allow accessing task parallelism and synchronization in a flexible and standardized way, potentially removing the need for third-party solutions. Nevertheless, since parallelization aims at high performance, an examination of the quality of implementation of these standardized means is necessary to determine their suitability for replacing established solutions. To that end, we present INNCABS, a new cross-platform cross-library benchmark suite consisting of 14 benchmarks with varying task granularities and synchronization requirements. Based on these benchmarks, we demonstrate that the performance of C++11 parallelism constructs in the three most commonly employed C++ runtime libraries prevents their use as a full replacement for third-party solutions due to simplistic parallelism implementations and high synchronization overheads.

Citations: 8

Cost-Efficient, Utility-Based Caching of Expensive Computations in the Cloud

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.49

Benjamin Byholm, F. Jokhio, A. Ashraf, S. Lafond, J. Lilius, Ivan Porres

Abstract: We present a model and system for deciding on computing versus storage trade-offs in the Cloud using von Neumann-Morgenstern lotteries. We use the decision model in a video-on-demand system providing cost-efficient transcoding and storage of videos. Video transcoding is an expensive computational process that converts a video from one format to another. Video data are large enough to cause concern over rising storage costs. In the general case, our work is of interest when dealing with expensive computations that generate large results that can be cached for future use. Solving the decision problem entails solving two sub-problems: how long to store cached objects and how many requests we can expect for a particular object in that duration. We compare the proposed approach to always storing and to our previous approach over one year using discrete-event simulations. We observe a 72% cost reduction compared to always storing and a 13% reduction compared to our previous approach. This reduction in cost stems from the proposed approach storing fewer unpopular objects when it does not regard storing them as cost-efficient.

Citations: 3
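
The two sub-problems named in the caching abstract (how long to store an object, and how many requests to expect in that time) feed a store-or-recompute decision. The sketch below shows only a plain expected-cost comparison, not the von Neumann-Morgenstern lottery model the paper uses. The function name, parameters, and prices are hypothetical.

```python
def should_store(size_gb, storage_price_per_gb_month, months,
                 expected_requests, recompute_cost):
    """Store the result only if keeping it is cheaper than the expected
    cost of recomputing it for every request in the same period."""
    store_cost = size_gb * storage_price_per_gb_month * months
    recompute_total = expected_requests * recompute_cost
    return store_cost < recompute_total


# Hypothetical prices: 0.03 per GB-month of storage, 0.10 per transcoding run.
unpopular = should_store(size_gb=2, storage_price_per_gb_month=0.03, months=1,
                         expected_requests=0.5, recompute_cost=0.10)
popular = should_store(size_gb=2, storage_price_per_gb_month=0.03, months=1,
                       expected_requests=20, recompute_cost=0.10)
print(unpopular, popular)
```

Under these assumed prices the rarely requested object is not worth storing while the popular one is, which is the behaviour the abstract credits for the cost reduction.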

RaceChecker: Efficient Identification of Harmful Data Races

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.19

Kai Lu, Zhendong Wu, Xiaoping Wang, Chen Chen, Xu Zhou

Abstract: Data races hidden in concurrent programs have caused severe failures. To improve reliability, many race detectors have been proposed. However, most of the reported races are not harmful, and identifying the harmful ones consumes manual effort. This paper proposes RaceChecker, which can detect potential races and identify the harmful ones effectively and efficiently. Unlike previous detectors, RaceChecker combines the happens-before relation and ad-hoc synchronization to prune infeasible races so that fewer potential races need to be verified. Before verification, RaceChecker groups the remaining potential races, guaranteeing that the potential races in one group do not interfere with each other. Therefore, multiple potential races in one group can be verified together in one execution. To our knowledge, this is the first effective technique that groups potential races to improve efficiency. Unlike previous detectors that verify one potential race per execution, RaceChecker dynamically controls the thread scheduler to create real race conditions and verify multiple potential races in one execution, identifying the harmful races that cause program failures. We have implemented RaceChecker as a prototype tool and have experimented on a number of real-world concurrent programs. Results show that 66% of the potential races are infeasible and that the grouping strategy eliminates nearly 48% of the executions. The known harmful races are also identified effectively. By pruning and grouping, RaceChecker identifies the harmful races more efficiently. Compared with RaceMob and RaceFuzzer, the time is reduced significantly, by an average of 45% and 81% respectively.

Citations: 14
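
The RaceChecker abstract relies on the happens-before relation to discard infeasible races. A common textbook way to test that relation is with vector clocks, sketched below. This is an illustration of the general idea only, not RaceChecker's implementation. The access tuples (thread id, vector clock, is_write) are a hypothetical representation.

```python
def happens_before(a, b):
    """True if vector clock a precedes b: a <= b componentwise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def may_race(access1, access2):
    """Two accesses to the same location may race if they come from
    different threads, at least one is a write, and neither is ordered
    before the other by happens-before."""
    (t1, clock1, is_write1), (t2, clock2, is_write2) = access1, access2
    if t1 == t2 or not (is_write1 or is_write2):
        return False
    return (not happens_before(clock1, clock2)
            and not happens_before(clock2, clock1))


write_t0 = (0, (1, 0), True)          # thread 0 writes at clock (1, 0)
read_t1_unsynced = (1, (0, 1), False) # thread 1 reads with no synchronization
read_t1_synced = (1, (1, 1), False)   # thread 1 reads after observing thread 0's clock

print(may_race(write_t0, read_t1_unsynced))
print(may_race(write_t0, read_t1_synced))
```

The unsynchronized pair is concurrent and stays a candidate, while the synchronized pair is ordered and can be pruned before any verification run.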

Integrating Data-Intensive Computing Systems with Biological Data Analysis Frameworks

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.106

Edvard Pedersen, I. Raknes, Martin Ernstsen, L. A. Bongo

Citations: 6

A Weighted Fat-Tree Routing Algorithm for Efficient Load-Balancing in InfiniBand Enterprise Clusters

2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing Pub Date : 2015-03-04 DOI: 10.1109/PDP.2015.111

Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, T. Skeie

Abstract: InfiniBand (IB) has become a popular network interconnect for high performance computing (HPC) systems. Many of the large IB-based HPC systems use some variant of the fat-tree topology to take advantage of the useful properties fat-trees offer. The fat-tree routing algorithm is one of the most efficient deterministic routing algorithms for fat-tree topologies. The algorithm ensures that the number of routes assigned to each link is balanced across the fabric. However, one problem with its load-balancing technique is that it assumes uniform traffic distribution in the network. When routes towards nodes that mainly consume large amounts of data are assigned to share links in the fabric while alternative links are underutilized, sub-optimal network throughput is obtained. Also, as the fat-tree algorithm routes nodes according to the indexing order, the performance may differ for two systems cabled in the exact same way. In this paper, we propose wFatTree, a novel fat-tree routing algorithm, which considers node traffic characteristics to balance load across the network links more evenly, and with predictable network performance. Our experiments and simulations show an improvement of up to 60% in total network throughput on large fat-tree installations when using wFatTree routing. Furthermore, wFatTree can also be used to prioritize traffic flowing towards the critical nodes in the network.

Citations: 10
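
The fat-tree routing abstract contrasts balancing the number of routes per link with balancing the traffic those routes carry. The toy sketch below makes that difference concrete on a flat set of links. It is not the wFatTree algorithm and ignores the tree topology entirely. The node names, weights, and the greedy heaviest-first heuristic are assumptions made for illustration.

```python
import heapq


def assign_round_robin(nodes, num_links):
    """Balance route counts only: nodes go to links in index order."""
    return {node: i % num_links for i, node in enumerate(nodes)}


def assign_by_weight(node_weights, num_links):
    """Greedy heuristic: place the heaviest node on the least-loaded link."""
    heap = [(0, link) for link in range(num_links)]
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(node_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for node, weight in ordered:
        load, link = heapq.heappop(heap)
        assignment[node] = link
        heapq.heappush(heap, (load + weight, link))
    return assignment


def link_loads(assignment, node_weights, num_links):
    """Sum the traffic weight routed over each link."""
    loads = [0] * num_links
    for node, link in assignment.items():
        loads[link] += node_weights[node]
    return loads


# Hypothetical traffic weights: "a" and "c" receive far more data than "b" and "d".
weights = {"a": 100, "b": 1, "c": 100, "d": 1}
count_balanced = link_loads(assign_round_robin(list(weights), 2), weights, 2)
weight_balanced = link_loads(assign_by_weight(weights, 2), weights, 2)
print(count_balanced, weight_balanced)
```

Both assignments put two routes on each link, but the index-order version happens to stack both heavy receivers on the same link, while the weight-aware version spreads them.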