MMD: An Approach to Improve Reading Performance in Deduplication Systems
Authors: Chao Li, Shupeng Wang, Xiao-chun Yun, Xiaoyang Zhou, Guangjun Wu
Venue: 2014 9th IEEE International Conference on Networking, Architecture, and Storage
Published: 2014-08-06
DOI: 10.1109/NAS.2014.21 (https://doi.org/10.1109/NAS.2014.21)
Citations: 3
Abstract
Data deduplication is widely used in backup systems and in primary storage such as virtual machine platforms. However, read speed in these systems suffers because deduplication fragments chunks across the storage, so improving read performance has become an important problem for deduplication systems. In this paper, we first propose MMD, a new storage method that uses multiple disks to boost read performance. MMD takes advantage of multiple disks operating in parallel, each used as an independent logical device. We then present a deduplication model based on MMD that optimizes the data layout on the disks to improve read speed. We discuss two I/O scheduling algorithms in this model, both of which aim to assign the containers of a deduplication system to appropriate disks. Experiments show that MMD achieves clearly better read performance than RAID in deduplication systems.
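
The abstract does not describe the two I/O scheduling algorithms, so the following is only an illustrative sketch of the general idea of placing containers on independent disks so that a restore can read them in parallel. The round-robin policy, the function names `assign_round_robin` and `read_rounds`, and the cost model (each disk serves one container per round) are all assumptions made for illustration, not the paper's method.

```python
from collections import defaultdict

def assign_round_robin(container_ids, num_disks):
    """Hypothetical policy: cycle through the disks in container write order."""
    return {cid: i % num_disks for i, cid in enumerate(container_ids)}

def read_rounds(read_sequence, placement):
    """Rough cost model (an assumption): every disk serves one container per
    round, so the most heavily loaded disk bounds the number of rounds."""
    load = defaultdict(int)
    for cid in set(read_sequence):  # each distinct container is fetched once
        load[placement[cid]] += 1
    return max(load.values(), default=0)

containers = [f"c{i}" for i in range(8)]
restore = ["c0", "c1", "c2", "c3"]  # containers needed by one restore

spread = assign_round_robin(containers, num_disks=4)
single = assign_round_robin(containers, num_disks=1)

print(read_rounds(restore, spread))  # containers land on four different disks
print(read_rounds(restore, single))  # all containers queue on one disk
```

Under this toy model, the four containers of the restore are fetched in one round when spread over four disks and in four rounds on a single disk. Real placement policies would also need to account for which containers tend to be read together.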