Jiwei Xu, Wen-bo Zhang, Shiyang Ye, Jun Wei, Tao Huang
2014 IEEE 38th Annual Computer Software and Applications Conference, 2014-07-21. DOI: 10.1109/COMPSAC.2014.73 (https://doi.org/10.1109/COMPSAC.2014.73)
A Lightweight Virtual Machine Image Deduplication Backup Approach in Cloud Environment
As most clouds are built on virtualization technology, a growing number of virtual machine images are created within data centers. To meet disaster recovery needs, the storage space used for backup can easily grow to the TB or PB level as the number of images increases. Unfortunately, different images share a large number of identical data segments, and these duplicated segments waste a significant amount of storage. Although much existing work on deduplication storage achieves good results in removing duplicate copies, it is not well suited to virtual machine image deduplication in a cloud environment, because the heavy resource usage of deduplication operations can cause serious performance interference with the hosting virtual machines. This paper proposes a local deduplication method that speeds up virtual machine image deduplication and reduces its operation time. The method is based on an improved k-means clustering algorithm, which classifies the metadata of backup images to reduce the search space of index lookup and improve index lookup performance. Experiments show that our approach is robust and effective: it significantly reduces the performance interference with hosting virtual machines at the cost of an acceptable increase in disk space usage.
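The idea of clustering image metadata so that each index lookup searches only one partition can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how k-means is improved, so the sketch uses plain Lloyd's k-means, and the metadata feature vectors, the fixed-size chunking, the 4-byte chunk size, and all names (`kmeans`, `ClusteredDedupStore`, `backup`) are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict


def chunk_fingerprints(data, chunk_size=4):
    """Split data into fixed-size chunks and return one SHA-1 digest per chunk."""
    return [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))


def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means, seeded with the first k points for determinism."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(dim) / len(groups[c]) for dim in zip(*groups[c]))
            if groups[c] else centroids[c]
            for c in range(k)
        ]
    return centroids


class ClusteredDedupStore:
    """Hypothetical store keeping one fingerprint index per metadata cluster."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.indexes = [set() for _ in centroids]

    def backup(self, features, data, chunk_size=4):
        """Deduplicate data against the nearest cluster's index only.

        Returns (new_chunks, duplicate_chunks).
        """
        index = self.indexes[nearest(features, self.centroids)]
        new = dup = 0
        for fp in chunk_fingerprints(data, chunk_size):
            if fp in index:
                dup += 1
            else:
                index.add(fp)
                new += 1
        return new, dup


# Hypothetical 2-D metadata vectors for four existing images (two obvious groups).
features = [(0.0, 1.0), (5.0, 5.0), (0.1, 0.9), (5.2, 4.8)]
store = ClusteredDedupStore(kmeans(features, k=2))

first = store.backup((0.0, 1.0), b"AAAABBBBCCCC")   # all chunks are new
second = store.backup((0.1, 0.9), b"AAAABBBBDDDD")  # same cluster: two chunks found
third = store.backup((5.0, 5.0), b"AAAAEEEE")       # other cluster: "AAAA" is stored again
```

In this sketch the third backup stores the chunk `AAAA` a second time because it lands in a different cluster. That mirrors the trade-off stated in the abstract: each lookup touches a smaller index, at the cost of some extra disk space for duplicates that span clusters.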