ACM Transactions on Storage: Latest Publications

LVMT: An Efficient Authenticated Storage for Blockchain
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-05-16 · DOI: 10.1145/3664818
Chenxing Li, Sidi Mohamed Beillahi, Guang Yang, Ming Wu, Wei Xu, Fan Long
{"title":"LVMT: An Efficient Authenticated Storage for Blockchain","authors":"Chenxing Li, Sidi Mohamed Beillahi, Guang Yang, Ming Wu, Wei Xu, Fan Long","doi":"10.1145/3664818","DOIUrl":"https://doi.org/10.1145/3664818","url":null,"abstract":"<p>Authenticated storage access is the performance bottleneck of a blockchain, because each access can be amplified to potentially <i>O</i>(log <i>n</i>) disk I/O operations in the standard Merkle Patricia Trie (MPT) storage structure. In this paper, we propose a multi-Layer Versioned Multipoint Trie (LVMT), a novel high-performance blockchain storage with significantly reduced I/O amplifications. LVMT uses the authenticated multipoint evaluation tree (AMT) vector commitment protocol to update commitment proofs in constant time. LVMT adopts a multi-layer design to support unlimited key-value pairs and stores version numbers instead of value hashes to avoid costly elliptic curve multiplication operations. In our experiment, LVMT outperforms the MPT in real Ethereum traces, delivering read and write operations six times faster. It also boosts blockchain system execution throughput by up to 2.7 times.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"8 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141059453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
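
A minimal sketch of the version-number idea follows: a write bumps a per-slot version counter instead of recomputing a value hash, which is what lets the commitment update run in constant time. The AMT vector commitment itself is omitted, and every name and structure below is hypothetical rather than the authors' implementation.

```python
# Sketch: store (slot, version) metadata instead of value hashes, so a
# write is a constant-time counter bump. The AMT commitment that LVMT
# maintains over the version vector is mocked away here.

class VersionedStore:
    def __init__(self, num_slots=16):
        self.versions = [0] * num_slots      # per-slot version counters
        self.values = {}                     # key -> (slot, version, value)

    def _slot(self, key: bytes) -> int:
        return hash(key) % len(self.versions)

    def put(self, key: bytes, value):
        s = self._slot(key)
        self.versions[s] += 1                # no elliptic-curve work, no hash
        self.values[key] = (s, self.versions[s], value)

    def get(self, key: bytes):
        s, ver, value = self.values[key]
        # A real LVMT would also return an AMT proof for (s, ver) here.
        return value, (s, ver)

store = VersionedStore()
store.put(b"balance/alice", 100)
print(store.get(b"balance/alice"))           # (100, (slot, 1))
```
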
A Memory-Disaggregated Radix Tree
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-05-08 · DOI: 10.1145/3664289
Xuchuan Luo, Pengfei Zuo, Jiacheng Shen, Jiazhen Gu, Xin Wang, Michael Lyu, Yangfan Zhou
{"title":"A Memory-Disaggregated Radix Tree","authors":"Xuchuan Luo, Pengfei Zuo, Jiacheng Shen, Jiazhen Gu, Xin Wang, Michael Lyu, Yangfan Zhou","doi":"10.1145/3664289","DOIUrl":"https://doi.org/10.1145/3664289","url":null,"abstract":"<p>Disaggregated memory (DM) is an increasingly prevalent architecture with high resource utilization. It separates computing and memory resources into two pools and interconnects them with fast networks. Existing range indexes on DM are based on B+ trees, which suffer from large inherent read and write amplifications. The read and write amplifications rapidly saturate the network bandwidth, resulting in low request throughput and high access latency of B+ trees on DM. </p><p>In this paper, we propose that the radix tree is more suitable for DM than the B+ tree due to smaller read and write amplifications. However, constructing a radix tree on DM is challenging due to the costly lock-based concurrency control, the bounded memory-side IOPS, and the complicated computing-side cache validation. To address these challenges, we design <b>SMART</b>, the first radix tree for disaggregated memory with high performance. Specifically, we leverage 1) a <i>hybrid concurrency control</i> scheme including lock-free internal nodes and fine-grained lock-based leaf nodes to reduce lock overhead, 2) a computing-side <i>read-delegation and write-combining</i> technique to break through the IOPS upper bound by reducing redundant I/Os, and 3) a simple yet effective <i>reverse check</i> mechanism for computing-side cache validation. Experimental results show that SMART achieves 6.1 × higher throughput under typical write-intensive workloads and 2.8 × higher throughput under read-only workloads in YCSB benchmarks, compared with state-of-the-art B+ trees on DM.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"41 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140928577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
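
Of the three techniques, read-delegation is the easiest to show in isolation: when several threads on a compute node want the same remote radix-tree node, one thread performs the remote read and the rest piggyback on its result, saving memory-side IOPS. The sketch below simulates the remote read with a plain function; it is an illustration of the idea under assumed names, not SMART's actual code.

```python
import threading

class ReadDelegator:
    """First reader of an address becomes the delegate; concurrent
    readers of the same address wait and reuse its result."""
    def __init__(self, remote_read):
        self.remote_read = remote_read       # simulated RDMA read: addr -> data
        self.lock = threading.Lock()
        self.inflight = {}                   # addr -> (done event, result slot)

    def read(self, addr):
        with self.lock:
            entry = self.inflight.get(addr)
            delegate = entry is None
            if delegate:
                entry = (threading.Event(), [])
                self.inflight[addr] = entry
        event, slot = entry
        if delegate:
            slot.append(self.remote_read(addr))  # one I/O serves all readers
            with self.lock:
                del self.inflight[addr]
            event.set()
        else:
            event.wait()                     # piggyback, no extra I/O
        return slot[0]

fetches = []
d = ReadDelegator(lambda a: fetches.append(a) or f"node@{a:#x}")
print(d.read(0x10), "remote reads:", len(fetches))
```
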
FSDedup: Feature-Aware and Selective Deduplication for Improving Performance of Encrypted Non-Volatile Main Memory
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-05-01 · DOI: 10.1145/3662736
Chunfeng Du, Zihang Lin, Suzhen Wu, Yifei Chen, Jiapeng Wu, Shengzhe Wang, Weichun Wang, Qingfeng Wu, Bo Mao
{"title":"FSDedup: Feature-Aware and Selective Deduplication for Improving Performance of Encrypted Non-Volatile Main Memory","authors":"Chunfeng Du, Zihang Lin, Suzhen Wu, Yifei Chen, Jiapeng Wu, Shengzhe Wang, Weichun Wang, Qingfeng Wu, Bo Mao","doi":"10.1145/3662736","DOIUrl":"https://doi.org/10.1145/3662736","url":null,"abstract":"<p>Enhancing the endurance, performance, and energy efficiency of encrypted Non-Volatile Main Memory (NVMM) can be achieved by minimizing written data through inline deduplication. However, existing approaches applying inline deduplication to encrypted NVMM suffer from substantial performance degradation due to high computing, memory footprint, and index-lookup overhead to generate, store, and query the cryptographic hash (fingerprint). In the preliminary ESD [14], we proposed the Error Correcting Code (ECC) assisted selective deduplication scheme, utilizing the ECC information as a fingerprint to identify similar data effectively and then leveraging the selective deduplication technique to eliminate a large amount of redundant data with high reference counts. In this paper, we proposed FSDedup. Compared with ESD, FSDedup could leverage the prefetch cache to reduce the read overhead during similarity comparison and utilize the cache refresh mechanism to identify further and eliminate more redundant data. Extensive experimental evaluations demonstrate that FSDedup can enhance the performance of the NVMM system further than the ESD. Experimental results show that FSDedup can improve both write and read speed by up to 1.8 ×, enhance Instructions Per Cycle (IPC) by up to 1.5 ×, and reduce energy consumption by up to 2.0 ×, compared to ESD.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"26 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140840473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
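
The ECC-as-fingerprint idea can be approximated in a few lines: a cheap code that the hardware already computes nominates duplicate candidates, an exact comparison confirms them, and only data that proves to be hot is deduplicated. In the sketch below, zlib.crc32 stands in for the ECC bits and the threshold is invented; this is a hedged illustration, not the paper's mechanism.

```python
import zlib

DEDUP_THRESHOLD = 3      # selective: leave cold duplicates alone (made up)
index = {}               # fingerprint -> (data, reference count)

def write_line(data: bytes):
    fp = zlib.crc32(data)                    # stand-in for the ECC fingerprint
    hit = index.get(fp)
    if hit is not None and hit[0] == data:   # fingerprint match is only a hint
        count = hit[1] + 1
        index[fp] = (data, count)
        if count >= DEDUP_THRESHOLD:
            return ("dedup", fp)             # store a reference, skip the write
    else:
        index[fp] = (data, 1)                # new (or colliding) content
    return ("write", fp)

for _ in range(4):
    print(write_line(b"A" * 64))             # write, write, dedup, dedup
```
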
Design and Implementation of Deduplication on F2FS
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-04-29 · DOI: 10.1145/3662735
Tiangmeng Zhang, Renhui Chen, Zijing Li, Congming Gao, Chengke Wang, Jiwu Shu
{"title":"Design and Implementation of Deduplication on F2FS","authors":"Tiangmeng Zhang, Renhui Chen, Zijing Li, Congming Gao, Chengke Wang, Jiwu Shu","doi":"10.1145/3662735","DOIUrl":"https://doi.org/10.1145/3662735","url":null,"abstract":"<p>Data deduplication technology has gained popularity in modern file systems due to its ability to eliminate redundant writes and improve storage space efficiency. In recent years, the flash-friendly file system (F2FS) has been widely adopted in flash memory based storage devices, including smartphones, fast-speed servers and Internet of Things. In this paper, we propose F2DFS (deduplication-based F2FS), which introduces three main design contributions. First, F2DFS integrates inline and offline hybrid deduplication. Inline deduplication eliminates redundant writes and enhances flash device endurance, while offline deduplication mitigates the negative I/O performance impact and saves more storage space. Second, F2DFS follows the file system coupling design principle, effectively leveraging the potentials and benefits of both deduplication and native F2FS. Also, with the aid of this principle, F2DFS achieves high-performance and space-efficient incremental deduplication. Third, F2DFS adopts virtual indexing to mitigate deduplication-induced many-to-one mapping updates during the segment cleaning. We conducted comprehensive experimental comparisons between F2DFS, native F2FS, and other state-of-the-art deduplication schemes, using both synthetic and real-world workloads. For inline deduplication, F2DFS outperforms SmartDedup, Dmdedup, and ZFS, in terms of both I/O bandwidth performance and deduplication rates. And for offline deduplication, compared to SmartDedup, XFS and BtrFS, F2DFS shows higher execution efficiency, lower resource usage and greater storage space savings. Moreover, F2DFS demonstrates more efficient segment cleanings than native F2FS.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"39 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140812433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
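
Virtual indexing is the piece that is easiest to miss, so here is a toy version of it: file extents reference a virtual block number, and when the segment cleaner relocates a deduplicated block, only the single virtual-to-physical entry changes, no matter how many files share the block. All structures are simplified stand-ins, assumed rather than taken from F2DFS.

```python
import hashlib

file_index = {}        # (inode, offset) -> virtual block number
v2p = {}               # virtual block -> physical block
refcount = {}          # virtual block -> number of referencing extents
by_hash = {}           # content hash -> virtual block (inline dedup index)
next_vblock = next_pblock = 0

def write(inode: int, offset: int, data: bytes):
    global next_vblock, next_pblock
    h = hashlib.sha256(data).hexdigest()
    v = by_hash.get(h)
    if v is None:                            # new content: allocate blocks
        v, next_vblock = next_vblock, next_vblock + 1
        v2p[v], next_pblock = next_pblock, next_pblock + 1
        by_hash[h], refcount[v] = v, 0
    refcount[v] += 1                         # dedup hit: only a reference
    file_index[(inode, offset)] = v

def clean_segment(vblock: int, new_pblock: int):
    v2p[vblock] = new_pblock                 # one update, however many refs

write(1, 0, b"x" * 4096)
write(2, 0, b"x" * 4096)                     # duplicate: no new physical block
clean_segment(0, 42)
print(file_index, v2p, refcount)
```
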
Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-04-16 · DOI: 10.1145/3658672
Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas
{"title":"Index Shipping for Efficient Replication in LSM Key-Value Stores with Hybrid KV Placement","authors":"Giorgos Stilianakis, Giorgos Saloustros, Orestis Chiotakis, Giorgos Xanthakis, Angelos Bilas","doi":"10.1145/3658672","DOIUrl":"https://doi.org/10.1145/3658672","url":null,"abstract":"<p>Key-value (KV) stores based on LSM tree have become a foundational layer in the storage stack of datacenters and cloud services. Current approaches for achieving reliability and availability favor reducing network traffic and send to replicas only new KV pairs. As a result, they perform costly compactions to reorganize data in both the primary and backup nodes, which increases device I/O traffic and CPU overhead, and eventually hurts overall system performance. In this paper we describe <i>Tebis</i>, an efficient LSM-based KV store that reduces I/O amplification and CPU overhead for maintaining the replica index. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends pre-built indexes to backup nodes, avoiding all compactions in backup nodes. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the pre-built region index. Our results show that <i>Tebis</i>\u0000reduces resource utilization on backup nodes compared to performing full compactions: Throughput is increased by 1.06 − 2.90 ×, CPU efficiency is increased by 1.21 − 2.78 ×, and I/O amplification is reduced by 1.7 − 3.27 ×, while network traffic increases by up to 1.32 − 3.76 ×.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"6 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140588933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
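
The replication scheme boils down to shipping a pre-built index whose pointers the backup can rebase instead of recomputing. A compact sketch of that pointer translation, with an invented index layout and region addressing, follows.

```python
def build_index(kv_pairs, region_base):
    """Primary side: after compaction, emit sorted (key, absolute pointer)
    entries into a value region that starts at region_base."""
    index, offset = [], 0
    for key, value in sorted(kv_pairs):
        index.append((key, region_base + offset))
        offset += len(value)
    return index

def translate_index(index, old_base, new_base):
    """Backup side: rebase pointers into the local region; no compaction."""
    return [(key, ptr - old_base + new_base) for key, ptr in index]

primary = build_index({b"a": b"v1", b"b": b"v22"}.items(), region_base=0x1000)
backup = translate_index(primary, old_base=0x1000, new_base=0x8000)
print(primary)
print(backup)
```
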
eZNS: Elastic Zoned Namespace for Enhanced Performance Isolation and Device Utilization
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-04-12 · DOI: 10.1145/3653716
Jaehong Min, Chenxingyu Zhao, Ming Liu, Arvind Krishnamurthy
{"title":"eZNS: Elastic Zoned Namespace for Enhanced Performance Isolation and Device Utilization","authors":"Jaehong Min, Chenxingyu Zhao, Ming Liu, Arvind Krishnamurthy","doi":"10.1145/3653716","DOIUrl":"https://doi.org/10.1145/3653716","url":null,"abstract":"<p>Emerging Zoned Namespace (ZNS) SSDs, providing the coarse-grained zone abstraction, hold the potential to significantly enhance the cost-efficiency of future storage infrastructure and mitigate performance unpredictability. However, existing ZNS SSDs have a static zoned interface, making them in-adaptable to workload runtime behavior, unscalable to underlying hardware capabilities, and interfering with co-located zones. Applications either under-provision the zone resources yielding unsatisfied throughput, create over-provisioned zones and incur costs, or experience unexpected I/O latencies. </p><p>We propose eZNS, an elastic-zoned namespace interface that exposes an adaptive zone with predictable characteristics. eZNS comprises two major components: a zone arbiter that manages zone allocation and active resources on the control plane, a hierarchical I/O scheduler with read congestion control, and write admission control on the data plane. Together, eZNS enables the transparent use of a ZNS SSD and closes the gap between application requirements and zone interface properties. Our evaluations over RocksDB demonstrate that eZNS outperforms a static zoned interface by 17.7% and 80.3% in throughput and tail latency, respectively, at most.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"52 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140576562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
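
To make the control-plane idea concrete, here is a toy zone arbiter that stripes a virtual zone over a demand-dependent number of physical zones drawn from a shared pool. The sizing policy, numbers, and names are all guesses for illustration; the paper's arbiter also manages active-resource budgets that this sketch ignores.

```python
class ZoneArbiter:
    def __init__(self, physical_zones):
        self.free = list(physical_zones)     # shared pool of physical zones

    def allocate_vzone(self, demand_mbps, per_zone_mbps=200, max_width=8):
        # ceil-divide demand by per-zone bandwidth, capped by policy and pool
        width = max(1, -(-demand_mbps // per_zone_mbps))
        width = min(width, max_width, len(self.free))
        return [self.free.pop() for _ in range(width)]

    def release(self, stripe):
        self.free.extend(stripe)             # zones go back to the pool

arbiter = ZoneArbiter(range(16))
hot = arbiter.allocate_vzone(demand_mbps=900)   # wide stripe for a hot tenant
cold = arbiter.allocate_vzone(demand_mbps=50)   # narrow stripe
print(hot, cold)
```
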
A Contract-Aware and Cost-Effective LSM Store for Cloud Storage with Low Latency Spikes
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-02-20 · DOI: 10.1145/3643851
Yuanhui Zhou, Jian Zhou, Kai Lu, Ling Zhan, Peng Xu, Peng Wu, Shuning Chen, Xian Liu, Jiguang Wan
{"title":"A Contract-Aware and Cost-effective LSM Store for Cloud Storage with Low Latency Spikes","authors":"Yuanhui Zhou, Jian Zhou, Kai Lu, Ling Zhan, Peng Xu, Peng Wu, Shuning Chen, Xian Liu, Jiguang Wan","doi":"10.1145/3643851","DOIUrl":"https://doi.org/10.1145/3643851","url":null,"abstract":"<p>Cloud storage is gaining popularity because features such as pay-as-you-go significantly reduce storage costs. However, the community has not sufficiently explored its contract model and latency characteristics. As LSM-Tree-based key-value stores (LSM stores) become the building block for numerous cloud applications, how cloud storage would impact the performance of key-value accesses is vital. This study reveals the significant latency variances of Amazon Elastic Block Store (EBS) under various I/O pressures, which challenges LSM store read performance on cloud storage. To reduce the corresponding tail latency, we propose Calcspar, a contract-aware LSM store for cloud storage, which efficiently addresses the challenges by regulating the rate of I/O requests to cloud storage and absorbing surplus I/O requests with the data cache. We specifically developed a fluctuation-aware cache to lower the high latency brought on by workload fluctuations. Additionally, we build a congestion-aware IOPS allocator to reduce the impact of LSM store internal operations on read latency. We evaluated Calcspar on EBS with different real-world workloads and compared it to the cutting-edge LSM stores. The results show that Calcspar can significantly reduce tail latency while maintaining regular read and write performance, keeping the 99<sup>th</sup> percentile latency under 550<i>μ</i>s and reducing average latency by 66%. In addition, Calcspar has lower write prices and average latency compared to Cloud NoSQL services offered by cloud vendors.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"36 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139956219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
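
Regulating the request rate to match the volume's contract is essentially a token bucket in front of the cloud volume, with the cache absorbing whatever the bucket rejects. The sketch below shows that shape; the numbers are arbitrary and the cache policy is a plain dictionary, so treat it as a minimal illustration rather than Calcspar's design.

```python
import time

class ContractGate:
    """Token bucket sized to the provisioned IOPS contract."""
    def __init__(self, provisioned_iops: float):
        self.rate = provisioned_iops
        self.tokens = provisioned_iops
        self.last = time.monotonic()

    def admit(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.rate,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                      # within contract: go to EBS
        return False                         # over contract: absorb locally

gate, cache = ContractGate(provisioned_iops=1000), {}

def read(key, cloud_read):
    if key in cache or not gate.admit():     # cache hit, or surplus request
        return cache.get(key)                # (None on a cold miss; a real
    cache[key] = cloud_read(key)             # system would queue instead)
    return cache[key]

print(read("k1", lambda k: b"value"))
```
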
An End-to-End High-Performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-01-30 · DOI: 10.1145/3643819
Nannan Zhao, Muhui Lin, Hadeel Albahar, Arnab K. Paul, Zhijie Huang, Subil Abraham, Keren Chen, Vasily Tarasov, Dimitrios Skourtis, Ali Anwar, Ali R. Butt
{"title":"An End-to-End High-Performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems","authors":"Nannan Zhao, Muhui Lin, Hadeel Albahar, Arnab K. Paul, Zhijie Huang, Subil Abraham, Keren Chen, Vasily Tarasov, Dimitrios Skourtis, Ali Anwar, Ali R. Butt","doi":"10.1145/3643819","DOIUrl":"https://doi.org/10.1145/3643819","url":null,"abstract":"<p>The wide adoption of Docker containers for supporting agile and elastic enterprise applications has led to a broad proliferation of container images. The associated storage performance and capacity requirements place a high pressure on the infrastructure of <b>container registries</b> that store and distribute images and <b>container storage systems</b> on the Docker client side that manage image layers and store ephemeral data generated at container runtime. The storage demand is worsened by the large amount of duplicate data in images. Moreover, container storage systems that use Copy-on-Write (CoW) file systems as storage drivers exacerbate the redundancy. Exploiting the high file redundancy in real-world images is a promising approach to drastically reduce the growing storage requirements of container registries and improve the space efficiency of container storage systems. However, existing deduplication techniques significantly degrade the performance of both registries and container storage systems because of data reconstruction overhead as well as the deduplication cost. </p><p>We propose DupHunter, an end-to-end deduplication scheme that deduplicates layers for both Docker registries and container storage systems while maintaining a high image distribution speed and container I/O performance. DupHunter is divided into 3 tiers: registry tier, middle tier, and client tier. Specifically, we first build a high-performance deduplication engine at the registry tier that not only natively deduplicates layers for space savings but also reduces layer restore overhead. Then, we use deduplication offloading at the middle tier to eliminate the redundant files from the client tier and avoid bringing deduplication overhead to the clients. To further reduce the data duplicates caused by CoWs and improve the container I/O performance, we utilize a container-aware storage system at the client tier that reserves space for each container and arranges the placement of files and their modifications on the disk to preserve locality. Under real workloads, DupHunter reduces storage space by up to 6.9 × and reduces the <monospace>GET</monospace> layer latency up to 2.8 × compared to the state-of-the-art. Moreover, DupHunter can improve the container I/O performance by up to 93% for reads and 64% for writes.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"94 18 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139646165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
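
The registry-tier trade-off (space savings versus layer restore overhead) can be shown with a toy store that deduplicates files by digest but keeps a full replica of selected layers so popular GETs skip reconstruction. The replica policy and all names here are invented for illustration.

```python
import hashlib

file_store = {}        # file digest -> file bytes (each unique file once)
layer_recipes = {}     # layer id -> ordered list of file digests
intact_layers = {}     # layer id -> raw blob kept for fast GETs

def put_layer(layer_id, files, keep_intact=False):
    recipe = []
    for data in files:
        digest = hashlib.sha256(data).hexdigest()
        file_store.setdefault(digest, data)  # dedup across all layers
        recipe.append(digest)
    layer_recipes[layer_id] = recipe
    if keep_intact:                          # replica avoids restore cost
        intact_layers[layer_id] = b"".join(files)

def get_layer(layer_id):
    blob = intact_layers.get(layer_id)
    if blob is not None:
        return blob                          # fast path: no reconstruction
    return b"".join(file_store[d] for d in layer_recipes[layer_id])

put_layer("layer1", [b"libA", b"libB"], keep_intact=True)
put_layer("layer2", [b"libA", b"libC"])      # libA is stored only once
print(get_layer("layer2"))
```
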
Exploiting Data-Pattern-Aware Vertical Partitioning to Achieve Fast and Low-Cost Cloud Log Storage
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-01-29 · DOI: 10.1145/3643641
Junyu Wei, Guangyan Zhang, Junchao Chen, Yang Wang, Weimin Zheng, Tingtao Sun, Jiesheng Wu, Jiangwei Jiang
{"title":"Exploiting Data-pattern-aware Vertical Partitioning to Achieve Fast and Low-cost Cloud Log Storage","authors":"Junyu Wei, Guangyan Zhang, Junchao Chen, Yang Wang, Weimin Zheng, Tingtao Sun, Jiesheng Wu, Jiangwei Jiang","doi":"10.1145/3643641","DOIUrl":"https://doi.org/10.1145/3643641","url":null,"abstract":"<p>Cloud logs can be categorized into on-line, off-line, and near-line logs based on the access frequency. Among them, near-line logs are mainly used for debugging, which means they prefer a low query latency for better user experience. Besides, the storage system for near-line logs prefers a low overall cost including the storage cost to store compressed logs, and the computation cost to compress logs and execute queries. These requirements pose challenges to achieving fast and cheap cloud log storage. </p><p>This paper proposes LogGrep, the first log compression and query tool that exploits both static and runtime patterns to properly structurize and organize log data in fine-grained units. The key idea of LogGrep is “vertical partitioning”: it stores each log entry into multiple partitions by first parsing logs into variable vectors according to static patterns and then extracting runtime pattern(s) automatically within each variable vector. Based on such runtime patterns, LogGrep further decomposes the variable vectors into fine-grained units called “Capsules” and stamps each Capsule with a summary of its values. During the query process, LogGrep can avoid decompressing and scanning Capsules that cannot match the keywords, with the help of the extracted runtime patterns and the Capsule stamps. We further show that the interactive debugging can well utilize the advantages of the vertical-partitioning-based method and mitigate its weaknesses as well. To this end, LogGrep integrates incremental locating and partial reconstruction to mitigate the read amplification incurred by vertical-partitioning-based method. </p><p>We evaluate LogGrep on 37 cloud logs from the production environment of Alibaba Cloud and the public datasets. The results show that LogGrep can reduce the query latency and the overall cost by an order of magnitude compared with state-of-the-art works. Such results have confirmed that it is worthwhile applying a more sophisticated vertical-partitioning-based method to accelerate queries on compressed cloud logs.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"334 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139585234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
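
A stripped-down version of vertical partitioning fits in a screenful: a static pattern splits entries into variable vectors, each vector is cut into fixed-size Capsules, and a per-Capsule stamp lets a query skip Capsules that cannot match. The stamp here is just a set of values and the log pattern is invented, so this is a shape sketch, not LogGrep's actual format.

```python
import re

STATIC = re.compile(r"(\w+) level=(\w+) code=(\d+)")   # assumed log shape
CAPSULE = 4                                  # values per Capsule (made up)

def build(lines):
    vectors = [[], [], []]                   # one vector per variable
    for line in lines:
        for vec, val in zip(vectors, STATIC.match(line).groups()):
            vec.append(val)
    columns = []
    for vec in vectors:
        caps = [vec[i:i + CAPSULE] for i in range(0, len(vec), CAPSULE)]
        columns.append([(set(cap), cap) for cap in caps])  # (stamp, data)
    return columns

def query(columns, col, keyword):
    hits = []
    for i, (stamp, cap) in enumerate(columns[col]):
        if keyword in stamp:                 # stamp check skips cold Capsules
            hits += [i * CAPSULE + j
                     for j, v in enumerate(cap) if v == keyword]
    return hits

logs = [f"req{i} level={'ERROR' if i == 5 else 'INFO'} code={i % 3}"
        for i in range(8)]
print(query(build(logs), col=1, keyword="ERROR"))      # -> [5]
```
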
Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search
IF 1.7 · CAS Zone 3 · Computer Science
ACM Transactions on Storage · Pub Date: 2024-01-06 · DOI: 10.1145/3639471
Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, Myoungsoo Jung
{"title":"Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search","authors":"Junhyeok Jang, Hanjin Choi, Hanyeoreum Bae, Seungjun Lee, Miryeong Kwon, Myoungsoo Jung","doi":"10.1145/3639471","DOIUrl":"https://doi.org/10.1145/3639471","url":null,"abstract":"<p>We propose <i>CXL-ANNS</i>, a software-hardware collaborative approach to enable scalable approximate nearest neighbor search (ANNS) services. To this end, we first disaggregate DRAM from the host via compute express link (CXL) and place all essential datasets into its memory pool. While this CXL memory pool allows ANNS to handle billion-point graphs without an accuracy loss, we observe that the search performance significantly degrades because of CXL’s far-memory-like characteristics. To address this, CXL-ANNS considers the node-level relationship and caches the neighbors in local memory, which are expected to visit most frequently. For the uncached nodes, CXL-ANNS prefetches a set of nodes most likely to visit soon by understanding the graph traversing behaviors of ANNS. CXL-ANNS is also aware of the architectural structures of the CXL interconnect network and lets different hardware components collaborate with each other for the search. Further, it relaxes the execution dependency of neighbor search tasks and allows ANNS to utilize all hardware in the CXL network in parallel. </p><p>Our evaluation shows that CXL-ANNS exhibits 93.3% lower query latency than state-of-the-art ANNS platforms that we tested. CXL-ANNS also outperforms an oracle ANNS system that has unlimited local DRAM capacity by 68.0%, in terms of latency.</p>","PeriodicalId":49113,"journal":{"name":"ACM Transactions on Storage","volume":"10 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139372911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
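
The caching idea is graph-shaped: nodes near the entry point are touched by almost every query, so their neighbor lists live in local DRAM while farther nodes come from the CXL pool. Below is a small simulation of that split, with the remote access counted rather than performed; hop-based caching and all names are assumptions for illustration.

```python
from collections import deque

def build_cache(graph, entry, cache_hops=1):
    """Cache neighbor lists within cache_hops of the entry node (BFS)."""
    cached, frontier = {entry: graph[entry]}, deque([(entry, 0)])
    while frontier:
        node, hop = frontier.popleft()
        if hop == cache_hops:
            continue
        for nbr in graph[node]:
            if nbr not in cached:
                cached[nbr] = graph[nbr]
                frontier.append((nbr, hop + 1))
    return cached

remote_reads = 0

def neighbors(graph, cache, node):
    global remote_reads
    if node in cache:
        return cache[node]                   # local DRAM hit
    remote_reads += 1                        # simulated CXL pool access
    return graph[node]

graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
cache = build_cache(graph, entry=0)
for n in (0, 1, 3):
    neighbors(graph, cache, n)
print(sorted(cache), "remote reads:", remote_reads)    # [0, 1, 2] 1
```
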