2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)最新文献_第3页

CRUDE: Combining Resource Usage Data and Error Logs for Accurate Error Detection in Large-Scale Distributed Systems 原油:结合资源使用数据和错误日志进行大规模分布式系统的准确错误检测

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.017

Nentawe Gurumdimma, A. Jhumka, Maria Liakata, Edward Chuah, J. Browne

{"title":"CRUDE: Combining Resource Usage Data and Error Logs for Accurate Error Detection in Large-Scale Distributed Systems","authors":"Nentawe Gurumdimma, A. Jhumka, Maria Liakata, Edward Chuah, J. Browne","doi":"10.1109/SRDS.2016.017","DOIUrl":"https://doi.org/10.1109/SRDS.2016.017","url":null,"abstract":"The use of console logs for error detection in large scale distributed systems has proven to be useful to system administrators. However, such logs are typically redundant and incomplete, making accurate detection very difficult. In an attempt to increase this accuracy, we complement these incomplete console logs with resource usage data, which captures the resource utilisation of every job in the system. We then develop a novel error detection methodology, the CRUDE approach, that makes use of both the resource usage data and console logs. We thus make the following specific technical contributions: we develop (i) a clustering algorithm to group nodes with similar behaviour, (ii) an anomaly detection algorithm to identify jobs with anomalous resource usage, (iii) an algorithm that links jobs with anomalous resource usage with erroneous nodes. We then evaluate our approach using console logs and resource usage data from the Ranger Supercomputer. Our results are positive: (i) our approach detects errors with a true positive rate of about 80%, and (ii) when compared with the well-known Nodeinfo error detection algorithm, our algorithm provides an average improvement of around 85% over Nodeinfo, with a best-case improvement of 250%.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128384497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 29

Visualizing and Controlling VMI-Based Malware Analysis in IaaS Cloud 基于vmi的IaaS云恶意软件分析的可视化与控制

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.035

Noëlle Rakotondravony, Hans P. Reiser

{"title":"Visualizing and Controlling VMI-Based Malware Analysis in IaaS Cloud","authors":"Noëlle Rakotondravony, Hans P. Reiser","doi":"10.1109/SRDS.2016.035","DOIUrl":"https://doi.org/10.1109/SRDS.2016.035","url":null,"abstract":"Security in virtualized environment has known the support of different tools in the low-level detection and analysis of malware. The in-guest tracing mechanisms are now capable of operating at assembly language-, system call-, function call-and instruction-level to detect and classify malicious activities. Therefore, they are producing large amount of data about the state of a target system. However, the integrity of such data becomes questionable whenever the hosting target system is compromised. With virtual machine introspection (VMI), the monitoring tool runs outside the target monitored virtual machine (VM) [1]. Thus, the integrity of retrieved data is ensured even if the target system is compromised. Various works have brought VMI to Infrastructure-as-a-Service (Iaas) cloud environment, allowing the cloud user to run (simultaneous) forensics operations on his production VMs. The associated tracing mechanisms can collect larger amount of data in form of commented behavior traces or unstandardized log records. Thus, a human operator is needed to efficiently parse, represent, visualize and interpret the collected data, to benefit from their security relevance [2]. The use of visualization helps analysts investigate, compare and culster malware samples [3]. Existing visualization tools make use of recorded information to enhance the detection of intrusive behavior or the clustering of malware [4] from the observed system. However, at our knowledge, no existing tools establish a pre-to post-exploitation visualization graphs. We present an approach that enhances the detection and analysis of malware in the cloud by providing the cloud end-users the mean to efficiently visualize the different security relevant data collected through multiple VMI-based mechanisms.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134119233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Including Security Monitoring in Cloud Service Level Agreements 包括云服务级别协议中的安全监控

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.034

Amir Teshome, Louis Rilling, C. Morin

引用次数: 4

Lateral Movement Detection Using Distributed Data Fusion 使用分布式数据融合的横向运动检测

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.014

Ahmed M. Fawaz, Atul Bohara, C. Cheh, W. Sanders

{"title":"Lateral Movement Detection Using Distributed Data Fusion","authors":"Ahmed M. Fawaz, Atul Bohara, C. Cheh, W. Sanders","doi":"10.1109/SRDS.2016.014","DOIUrl":"https://doi.org/10.1109/SRDS.2016.014","url":null,"abstract":"Attackers often attempt to move laterally from host to host, infecting them until an overall goal is achieved. One possible defense against this strategy is to detect such coordinated and sequential actions by fusing data from multiple sources. In this paper, we propose a framework for distributed data fusion that specifies the communication architecture and data transformation functions. Then, we use this framework to specify an approach for lateral movement detection that uses host-level process communication graphs to infer network connection causations. The connection causations are then aggregated into system-wide host-communication graphs that expose possible lateral movement in the system. In order to provide a balance between the resource usage and the robustness of the fusion architecture, we propose a multilevel fusion hierarchy that uses different clustering techniques. We evaluate the scalability of the hierarchical fusion scheme in terms of storage overhead, number of message updates sent, fairness of resource sharing among clusters, and quality of local graphs. Finally, we implement a host-level monitor prototype to collect connection causations, and evaluate its overhead. The results show that our approach provides an effective method to detect lateral movement between hosts, and can be implemented with acceptable overhead.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116240212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 26

Norton Zone: Symantec's Secure Cloud Storage System 诺顿专区:赛门铁克安全云存储系统

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.020

Walter Bogorad, Scott Schneider, Haibin Zhang

引用次数: 0

Inside-Out: Reliable Performance Prediction for Distributed Storage Systems in the Cloud Inside-Out:云中分布式存储系统的可靠性能预测

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.025

Chin-Jung Hsu, R. Panta, Moo-Ryong Ra, V. Freeh

{"title":"Inside-Out: Reliable Performance Prediction for Distributed Storage Systems in the Cloud","authors":"Chin-Jung Hsu, R. Panta, Moo-Ryong Ra, V. Freeh","doi":"10.1109/SRDS.2016.025","DOIUrl":"https://doi.org/10.1109/SRDS.2016.025","url":null,"abstract":"Many storage systems are undergoing a significant shift from dedicated appliance-based model to software-defined storage (SDS) because the latter is flexible, scalable and cost-effective for modern workloads. However, it is challenging to provide a reliable guarantee of end-to-end performance in SDS due to complex software stack, time-varying workload and performance interference among tenants. Therefore, modeling and monitoring the performance of storage systems is critical for ensuring reliable QoS guarantees. Existing approaches such as performance benchmarking and analytical modeling are inadequate because they are not efficient in exploring large configuration space, and cannot support elastic operations and diverse storage services in SDS. This paper presents Inside-Out, an automatic model building tool that creates accurate performance models for distributed storage services. Inside-Out is a black-box approach. It builds high-level performance models by applying machine learning techniques to low-level system performance metrics collected from individual components of the distributed SDS system. Inside-Out uses a two-level learning method that combines two machine learning models to automatically filter irrelevant features, boost prediction accuracy and yield consistent prediction. Our in-depth evaluation shows that Inside-Out is a robust solution that enables SDS to predict end-to-end performance even in challenging conditions, e.g., changes in workload, storage configuration, available cloud resources, size of the distributed storage service, and amount of interference due to multi-tenants. Our experiments show that Inside-Out can predict end-to-end performance with 91.1% accuracy on average. Its prediction accuracy is consistent across diverse storage environments.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129099510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 11

VMBeam: Zero-Copy Migration of Virtual Machines for Virtual IaaS Clouds VMBeam:虚拟IaaS云的虚拟机零拷贝迁移

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.024

Kenichi Kourai, H. Ooba

引用次数: 2

Tale of Tails: Anomaly Avoidance in Data Centers 尾巴的故事:数据中心的异常避免

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.021

Ji Xue, R. Birke, L. Chen, E. Smirni

{"title":"Tale of Tails: Anomaly Avoidance in Data Centers","authors":"Ji Xue, R. Birke, L. Chen, E. Smirni","doi":"10.1109/SRDS.2016.021","DOIUrl":"https://doi.org/10.1109/SRDS.2016.021","url":null,"abstract":"It is a common practice that today's cloud data centers guard the performance by monitoring the resource usage, e.g., CPU and RAM, and issuing anomaly tickets whenever detecting usages exceeding predefined target values. Ensuring free of such usage anomaly can be extremely challenging, while catering to a large amount of virtual machines (VMs) showing bursty workloads on a limited amount of physical resource. Using resource usage data from production data centers that consist of more than 6K physical machines hosting more than 80K VMs, we identify statistic properties of anomaly instances (AIs) on physical servers, highlighting their burst duration and potential root causes. To strike a tradeoff between a strong performance guarantee and resource provisions, we propose a tail-driven anomaly avoidance policy for boxes, TailGuard, which allows a small fraction of AIs, e.g., 5% of usages can be above the target value, and still avoid severe performance degradation, typically caused by a burst of continuous AI. Specifically, TailGuard first introduces a novel usage tail prediction that explores the similarity patterns across a great number of boxes within a very recent history, and then redistributes the server load in an online fashion by proactive VM cloning and reactive load balancing. Evaluation results show that TailGuard can not only achieve an accuracy comparable with prediction methodology that relies on long history of usage data but also dramatically reduce the number of CPU AIs by 60%, with a tenfold reduction of their duration, from more than 25 time windows to only 2.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129191842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

On the Cost of Safe Storage for Public Clouds: An Experimental Evaluation 公共云的安全存储成本:一个实验评估

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.028

Dorian Burihabwa, Rogério Pontes, P. Felber, Francisco Maia, H. Mercier, R. Oliveira, J. Paulo, V. Schiavoni

{"title":"On the Cost of Safe Storage for Public Clouds: An Experimental Evaluation","authors":"Dorian Burihabwa, Rogério Pontes, P. Felber, Francisco Maia, H. Mercier, R. Oliveira, J. Paulo, V. Schiavoni","doi":"10.1109/SRDS.2016.028","DOIUrl":"https://doi.org/10.1109/SRDS.2016.028","url":null,"abstract":"Cloud-based storage services such as Dropbox, Google Drive and OneDrive are increasingly popular for storing enterprise data, and they have already become the de facto choice for cloud-based backup of hundreds of millions of regular users. Drawn by the wide range of services they provide, no upfront costs and 24/7 availability across all personal devices, customers are well-aware of the benefits that these solutions can bring. However, most users tend to forget—or worse ignore—some of the main drawbacks of such cloud-based services, namely in terms of privacy. Data entrusted to these providers can be leaked by hackers, disclosed upon request from a governmental agency's subpoena, or even accessed directly by the storage providers (e.g., for commercial benefits). While there exist solutions to prevent or alleviate these problems, they typically require direct intervention from the clients, like encrypting their data before storing it, and reduce the benefits provided such as easily sharing data between users. This practical experience report studies a wide range of security mechanisms that can be used atop standard cloud-based storage services. We present the details of our evaluation testbed and discuss the design choices that have driven its implementation. We evaluate several state-of-the-art techniques with varying security guarantees responding to user-assigned security and privacy criteria. Our results reveal the various trade-offs of the different techniques by means of representative workloads on top of industry-grade storage services.","PeriodicalId":165721,"journal":{"name":"2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116188631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Collaborative Stabilization 合作稳定

2016 IEEE 35th Symposium on Reliable Distributed Systems (SRDS) Pub Date : 2016-09-01 DOI: 10.1109/SRDS.2016.043

Mohammad Roohitavaf, S. Kulkarni

引用次数: 2