{"title":"CloudInsight: Shedding Light on the Cloud","authors":"A. Arefin, Guofei Jiang","doi":"10.1109/SRDS.2011.34","DOIUrl":null,"url":null,"abstract":"Cloud computing provides a revolutionary new computing paradigm for deploying enterprise applications and Internet services. Rather than operating their own data centers, today cloud users run their applications on the remote cloud infrastructures that are owned and managed by cloud providers. However, the cloud computing paradigm also introduces some new challenges in system management. Cloud users create virtual machine instances to run their specific application logic without knowing the underlying physical infrastructure. On the other side, cloud providers manage and operate their cloud infrastructures without knowing their customers' applications. Due to the decoupled ownership of applications and infrastructures, if a problem occurs, there is no visibility for either cloud users or providers to understand the whole context of the incident and solve it quickly. To this end, we propose a software solution, Cloud Insight, to provide some visibility through the middle virtualization layer for both cloud users and providers to address their problems quickly. Cloud Insight automatically tracks each VM instance's configuration status and maintains their life-cycle configuration records in a configuration management database (CMDB). When a user reports a problem, our algorithms automatically analyze CMDB to probabilistically determine the root cause and invoke a recovery process by interacting with the cloud user. Experimental results over data from Amazon EC2 online support forum and NEC Labs' research cloud infrastructures demonstrate that our approach can effectively automate the problem troubleshooting process in cloud environments.","PeriodicalId":116805,"journal":{"name":"2011 IEEE 30th International Symposium on Reliable Distributed Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 30th International Symposium on Reliable Distributed Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SRDS.2011.34","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Cloud computing provides a revolutionary new computing paradigm for deploying enterprise applications and Internet services. Rather than operating their own data centers, today cloud users run their applications on the remote cloud infrastructures that are owned and managed by cloud providers. However, the cloud computing paradigm also introduces some new challenges in system management. Cloud users create virtual machine instances to run their specific application logic without knowing the underlying physical infrastructure. On the other side, cloud providers manage and operate their cloud infrastructures without knowing their customers' applications. Due to the decoupled ownership of applications and infrastructures, if a problem occurs, there is no visibility for either cloud users or providers to understand the whole context of the incident and solve it quickly. To this end, we propose a software solution, Cloud Insight, to provide some visibility through the middle virtualization layer for both cloud users and providers to address their problems quickly. Cloud Insight automatically tracks each VM instance's configuration status and maintains their life-cycle configuration records in a configuration management database (CMDB). When a user reports a problem, our algorithms automatically analyze CMDB to probabilistically determine the root cause and invoke a recovery process by interacting with the cloud user. Experimental results over data from Amazon EC2 online support forum and NEC Labs' research cloud infrastructures demonstrate that our approach can effectively automate the problem troubleshooting process in cloud environments.