Self-Adaptive Linear Hashing for solid state drives

2016 IEEE 32nd International Conference on Data Engineering (ICDE) | Pub Date: 2016-05-16 | DOI: 10.1109/ICDE.2016.7498260

Chengcheng Yang, Peiquan Jin, Lihua Yue, Dezhi Zhang

Pages: 433-444

Citations: 8


Self-Adaptive Linear Hashing for solid state drives

Flash-memory-based solid state drives (SSDs) have emerged as an alternative to magnetic disks because of their high performance and low power consumption. However, random writes on SSDs are much slower than reads. Traditional index structures, which were designed around the symmetric I/O characteristics of magnetic disks, therefore cannot fully exploit the performance of SSDs. In this paper, we propose an SSD-optimized linear hashing index called Self-Adaptive Linear Hashing (SAL-Hashing), which reduces the small random writes that index operations issue to SSDs. Our work makes several contributions. First, we organize buckets into groups and sets to facilitate coarse-grained writes and lazy splits, which avoids intermediate writes to the hash structure. A group consists of a fixed number of buckets, and a set consists of a number of groups. Second, we attach a log region to each set and amortize the cost of reads and writes by committing updates to the log region in batches. Third, to reduce search cost, each log region is equipped with Bloom filters that index the update logs. We also devise a cost-based online algorithm that adaptively merges a log region with its set when the set becomes search-intensive. Finally, to exploit the internal package-level parallelism of SSDs, we apply coarse-grained writes to merge and split operations to achieve high bandwidth. Our experimental results suggest that SAL-Hashing adapts to changes in access patterns and outperforms several competitors under various workloads on two commodity SSDs.
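The abstract does not give the addressing scheme, the group and set sizes, or the log layout, so the following is only a rough sketch of the ideas it names, not the authors' implementation. It assumes textbook linear hashing for bucket addressing, and every constant, class, and function name (`BUCKETS_PER_GROUP`, `GROUPS_PER_SET`, `INITIAL_BUCKETS`, `bucket_address`, `locate`, `BloomFilter`, `SetLog`) is hypothetical. The cost-based merge policy and the on-SSD layout are not modeled.

```python
import hashlib

# All three constants are assumptions for illustration; the abstract only says
# a group has "a fixed number of buckets" and a set has "a number of groups".
BUCKETS_PER_GROUP = 4
GROUPS_PER_SET = 2
INITIAL_BUCKETS = 4


def bucket_address(key, level, split_ptr):
    """Textbook linear hashing: buckets before split_ptr have already been split
    in this round, so keys that land there are rehashed with the next function."""
    addr = key % (INITIAL_BUCKETS * 2 ** level)
    if addr < split_ptr:
        addr = key % (INITIAL_BUCKETS * 2 ** (level + 1))
    return addr


def locate(bucket):
    """Map a bucket number to (set_id, group_id) under the assumed fixed sizes."""
    group = bucket // BUCKETS_PER_GROUP
    return group // GROUPS_PER_SET, group


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit field."""

    def __init__(self, bits=256, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.field |= 1 << pos

    def might_contain(self, key):
        return all((self.field >> pos) & 1 for pos in self._positions(key))


class SetLog:
    """In-memory stand-in for a per-set log region: updates are appended rather
    than written into buckets, and a Bloom filter indexes the logged keys."""

    def __init__(self):
        self.entries = []
        self.bloom = BloomFilter()

    def append(self, key, value):
        self.entries.append((key, value))
        self.bloom.add(key)

    def lookup(self, key):
        if not self.bloom.might_contain(key):
            return None  # key is definitely not in the log, so skip the scan
        for logged_key, value in reversed(self.entries):  # newest update wins
            if logged_key == key:
                return value
        return None  # Bloom filter false positive
```

In this sketch a search would consult the set's log first and fall back to the bucket only when the log has no entry for the key. The Bloom filter lets most searches for unlogged keys skip the log scan, which is the search-cost reduction the abstract attributes to it.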

