{"title":"Modeling the Location Selection of Mirror Servers in Content Delivery Networks","authors":"Peter Hillmann, Tobias Uhlig, G. Rodosek, O. Rose","doi":"10.1109/BigDataCongress.2016.68","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.68","url":null,"abstract":"For a provider of a Content Delivery Network (CDN), the location selection of mirror servers is a complex optimization problem. Generally, the objective is to place the nodes centralized such that all customers have convenient access to the service according to their demands. It is an instance of the k-center problem, which is proven to be NP-hard. Determining reasonable server locations directly influences run time effects and future service costs. We model, simulate, and optimize the properties of a content delivery network. Specifically, considering the server locations in a network infrastructure with prioritized customers and weighted connections. A simulation model for the servers is necessary to analyze the caching behavior in accordance to the targeted customer requests. We analyze the problem and compare different optimization strategies. For our simulation, we employ various realistic scenarios and evaluate several performance indicators. Our new optimization approach shows a significant improvement. The presented results are generally applicable to other domains with k-center problems, e.g., the placement of military bases, the planning and placement of facility locations, or data mining.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":" 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113947138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of a Multidimensional Data Retrieval Sorting Optimization Model","authors":"Danfeng Yan, Liying Zhang, Xuan Zhao","doi":"10.1109/BigDataCongress.2016.38","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.38","url":null,"abstract":"Currently, how to accurately and quickly locate required information from the massive network data, especially from the current popular social network data, is the focus of data retrieval services. Based on the traditional data retrieval sorting technology, this paper proposes a multi-dimensional data retrieval sorting optimization model, considering the characteristics of data, users and applications. Meanwhile, this paper implements this model in the system of financial microblog data retrieval. It enables the retrieval system to sort the results according to the characteristics of the microblog data, users' real query intentions and financial tendency of the system. Finally, this paper shows the basic test results, and future researches are discussed.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132496049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Open Source Big Data Analytics Frameworks Written in Scala","authors":"J. Miller, Casey N. Bowman, V. Harish, Shannon P. Quinn","doi":"10.1109/BigDataCongress.2016.61","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.61","url":null,"abstract":"Frameworks for big data arguably began with Google's use of MapReduce. Since then, a huge amount of progress has been made in the development of big data frameworks, many of which have been released as open source. Further to increase portability and ease of set-up, many are coded in a Java Virtual Machine (JVM) based language, e.g., Java or Scala. In addition, processing of big data involves the flow of data, and of course, the processing of data as it flows. This computational paradigm is a natural for functional programming. Furthermore, the map, reduce and combiner have analogs in functional programming. There has been a trend in the last few years toward developing open source big data frameworks written in Scala to support big data analytics. Scala is a modern JVM language that supports both object-oriented and functional programming paradigms.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"22 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124090328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complex Quality of Service Lifecycle Assessment Methodology","authors":"R. Maule","doi":"10.1109/BigDataCongress.2016.71","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.71","url":null,"abstract":"Large-scale systems engineering projects involving hundreds of independent systems with complex systems integration requirements and high levels of security necessitate specialized analytics methodology to ensure systems readiness across their operational lifecycle. This includes assessment of systems, components, processes and services over time, and in the range of technical, operational and environmental contexts in which the service will operate. This paper presents a quality of service audit method for assessment of complex integrated services.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"38 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131963004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy-Aware Big Data Warehouse Architecture","authors":"Karthik Navuluri, R. Mukkamala, Aftab Ahmad","doi":"10.1109/BigDataCongress.2016.53","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.53","url":null,"abstract":"Along with the ever increasing growth in data collection and its mining, there is an increasing fear of compromising individual and population privacy. Several techniques have been proposed in literature to preserve privacy of collected data while storing and processing. In this paper, we propose a privacy-aware architecture for storing and processing data in a Big Data warehouse. In particular, we propose a flexible, extendable, and adaptable architecture that enforces user specified privacy requirements in the form of Embedded Privacy Agreements. The paper discusses the details of the architecture with some implementation details.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115245185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification as a Service: Large-Scale Cloud Service Discovery over the World Wide Web","authors":"Abdullah Alfazi, Quan Z. Sheng, W. Zhang, Lina Yao, Talal H. Noor","doi":"10.1109/BigDataCongress.2016.74","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.74","url":null,"abstract":"Cloud computing is provisioned with high flexibility with regard to on demand infrastructures, platforms and software as services through the Internet. The unique characteristics of cloud services such as dynamic and diverse services offering at different levels, as well as the lack of standardized description, are becoming important challenges in efficiently discovering cloud services for customers. In this paper, we propose a cloud service search engine that has the capability to automatically identify cloud services aiming at improving the accuracy when searching cloud services in real environments. Our search engine can detect cloud services effectively from the Web sources. Furthermore, we focus on learning the cloud service features, such as similarity function, semantic ontology and cloud service components to identify the cloud services. We use a real cloud service dataset to build an identifier. Our cloud service identifier can be used to automatically determine whether a given Web source is a cloud service with high accuracy.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133438414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards an Efficient Top-K Trajectory Similarity Query Processing Algorithm for Big Trajectory Data on GPGPUs","authors":"Eleazar Leal, L. Gruenwald, Jianting Zhang, Simin You","doi":"10.1109/BigDataCongress.2016.33","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.33","url":null,"abstract":"Through the use of location-sensing devices, it has been possible to collect very large datasets of trajectories. These datasets make it possible to issue spatio-temporal queries with which users can gather information about the characteristics of the movements of objects, derive patterns from that information, and understand the objects themselves. Among such spatio-temporal queries that can be issued is the top-K trajectory similarity query. This query finds many applications, such as bird migration analysis in ecology and trajectory sharing in social networks. However, the large size of the trajectory query sets and databases poses significant computational challenges. In this work, we propose a parallel GPGPU algorithm Top-KaBT that is specifically designed to reduce the size of the candidate set generated while processing these queries, and in doing so strives to address these computational challenges. The experiments show that the state of the art top-K trajectory similarity query processing algorithm on GPGPUs, TKSimGPU, achieves a 6.44X speedup in query processing time when combined with our algorithm and a 13X speedup over a GPGPU algorithm that uses exhaustive search.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125581834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Don't Fire Me, a Kernel Autoregressive Hybrid Model for Optimal Layoff Plan","authors":"Zhiling Luo, Ying Li, Ruisheng Fu, Jianwei Yin","doi":"10.1109/BigDataCongress.2016.72","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.72","url":null,"abstract":"Job cutting occurs when a modern service enterprise reduces the employing labour cost by firing some staffs. Making an appropriate layoff plan is always quite difficult since a bad job cutting has a serious impact on not only the organization but also the business process executing efficiency. Therefore, in this paper, we address the problem of making an optimal layoff plan with the least influence on the executing of the business process. The key challenge is estimating the process throughput under a layoff plan. We overcome this challenge by two steps: regressing the activity throughput by the stuff number and inferring process throughput by the maximum flow or minimum cut algorithm on the Directed Acyclic Graph of process. In the regressing step, a kernel autoregressive hybrid model is proposed, whose MSE is 30% lower than SVM. After that, an augmenting path based algorithm is introduced to make an optimal layoff plan. To evaluate the accuracy of our model, we conduct an external experiment on a real dataset from the workflow system employed in the government of Hangzhou City in China, which results in 9750969 logs from 2050 activities and 16295 employees in two years.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134328921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geelytics: Enabling On-Demand Edge Analytics over Scoped Data Sources","authors":"Bin Cheng, Apostolos Papageorgiou, M. Bauer","doi":"10.1109/BigDataCongress.2016.21","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.21","url":null,"abstract":"Large-scale Internet of Things (IoT) systems typically consist of a large number of sensors and actuators distributed geographically in a physical environment. To react fast on real time situations, it is often required to bridge sensors and actuators via real-time stream processing close to IoT devices. Existing stream processing platforms like Apache Storm and S4 are designed for intensive stream processing in a cluster or in the Cloud, but they are unsuitable for large scale IoT systems in which processing tasks are expected to be triggered by actuators on-demand and then be allocated and performed in a Cloud-Edge environment. To fill this gap, we designed and implemented a new system called Geelytics, which can enable on-demand edge analytics over scoped data sources via IoT-friendly interfaces to sensors and actuators. This paper presents its design, implementation, interfaces, and core algorithms. Three example applications have been built to showcase the potential of Geelytics in enabling advanced IoT edge analytics. Our preliminary evaluation results demonstrate that we can reduce the bandwidth cost by 99% in a face detection example, achieve less than 10 milliseconds reacting latency and about 1.5 seconds startup latency in an outlier detection example, and also save 65% duplicated computation cost via sharing intermediate results in a data aggregation example.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121374805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Infra: SLO Aware Elastic Auto-scaling in the Cloud for Cost Reduction","authors":"Subhajit Sidhanta, S. Mukhopadhyay","doi":"10.1109/BigDataCongress.2016.25","DOIUrl":"https://doi.org/10.1109/BigDataCongress.2016.25","url":null,"abstract":"Enterprises often host applications and services on clusters of virtual machine instances provided by cloud service providers, like Amazon, Rackspace, Microsoft, etc. Users pay a cloud usage cost on the basis of the hourly usage [1] of virtual machine instances composing the cluster. A cluster composition refers to the number of virtual machine instances of each type (from a predefined list of types) comprising a cluster. We present Infra, a cloud provisioning framework that can predict an (ϵ, δ)-minimum cluster composition required to run a given application workload on a cloud under an SLO (i.e., Service Level Objective) deadline. This paper does not present a new approximation algorithm, instead we provide a tool that applies existing machine learning techniques to predict an (ϵ, δ)-minimum cluster composition. An (ϵ, δ)-minimum cluster composition specifies a cluster composition whose cost approximates that of the minimum cluster composition (i.e., the cluster composition that incurs the minimum cloud usage cost that must be incurred in executing a given application under an SLO deadline); the approximation bounds the error to a predefined threshold ϵ with a degree of confidence 100 * (1 - δ)%. The degree of confidence 100 * (1 - δ)% specifies that the probability of failure in achieving the error threshold ϵ for the above approximation is at most δ. For ϵ = 0.1 and δ = 0.02, we experimentally demonstrate that an (ϵ, δ)-minimum cluster composition predicted by Infra successfully approximates the minimum cluster composition, i.e., the accuracy of prediction of minimum cluster composition ranges from 93.1% to 97.99% (the error is bound by the error threshold of 0.1) with a 98% degree of confidence, since 100* (1 - δ) = 98%. Auto scaling refers to the process of automatically adding cloud instances to a cluster to adapt to an increase in application workload (increased request rate), and deleting instances from a cluster when there is a decrease in workload (reduced request rate). However, state-of-the-art auto scaling techniques have the following disadvantages: A) they require explicit policy definition for changing the cluster configuration and therefore lack the ability to automatically adapt a cluster with respect to changing workload, B) they do not compute the appropriate size of resources required, and therefore do not result in an “optimal” cluster composition. Infra provides an auto scaler that automatically adapts a cloud infrastructure to changing application workload, scaling the cluster up/down based on predictions from the Infra provisioning tool.","PeriodicalId":407471,"journal":{"name":"2016 IEEE International Congress on Big Data (BigData Congress)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128435628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}