{"title":"C-LSM: Cooperative Log Structured Merge Trees","authors":"Natasha Mittal, Faisal Nawab","doi":"10.1145/3357223.3365443","DOIUrl":null,"url":null,"abstract":"The basic structure of the LSM[3] tree consists of four levels (we are considering only 4 levels), L0 in memory, and L1 to L3 in the disk. Compaction in L0/L1 is done through tiering, and compaction in the rest of the tree is done through leveling. Cooperative-LSM (C-LSM) is implemented by deconstructing the monolithic structure of LSM[3] trees to enhance the scalability of LSM trees by utilizing the resources of multiple machines in a more flexible way. The monolithic structure of LSM[3] tree lacks flexibility, and the only way to deal with an increased load on is to re-partition the data and distribute it across nodes. C-LSM comprises of three components - leader, compactor, and backup. Leader node receives write requests. It maintains Levels L0 and L1 of the LSM tree and performs minor compactions. Compactor maintains the rest of the levels (L2 and L3) and is responsible for compacting them. Backup maintains a copy of the entire LSM tree for fault-tolerance and read availability. The advantages that C-LSM provides are two-fold: • one can place components on different machines, and • one can have more than one instance of each component Running more than one instance for each component can enable various performance advantages: • Increasing the number of Leaders enables to digest data faster because the performance of a single machine no longer limits the system. • Increasing the number of Compactors enables to offload compaction[1] to more nodes and thus reduce the impact of compaction on other functions. • Increasing the number of backups increases read availability. Although, all these advantages can be achieved by re-partitioning the data and distributing the partitions across nodes, which most current LSM variants do. However, we hypothesize that partitioning is not feasible for all cases. 
For example, a dynamic workload where access patterns are unpredictable and no clear partitioning is feasible. In this case, the developer either has to endure the overhead of re-partitioning the data all the time or not be able to utilize the system resources efficiently if no re-partitioning is done. C-LSM enables scaling (and down-scaling) with less overhead compared to re-partitioning; if a partition is suddenly getting more requests, one can simply add a new component on another node. Each one of the components has different characteristics in terms of how it affects the workload and I/O. By having the flexibility to break down the components, one can find ways to distribute them in a way to increase overall efficiency. Having multiple instances of the three components leads to interesting challenges in terms of how to ensure that they work together without leading to any inconsistencies. We are trying to solve this through careful design of how these components interact and how to manage the decisions when failures or scaling events happen. Another interesting problem to solve is having multiple instances of C-LSM, each dedicated to one edge node or a cluster of edge nodes. For mobile-based or real-time data analysis applications, more and more data needs to be processed in edge nodes[2] itself and having a dedicated C-LSM will improve the overall latency. There are also some down-sides with more than one components that need to be addressed. For e.g., having more than one compaction server leads to the need for compaction across machines and/or redundancy of data, or having more than one leader needs to maintain a linearizable access.","PeriodicalId":91949,"journal":{"name":"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... 
SoCC (Conference)","volume":"170 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3357223.3365443","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The basic structure of an LSM tree [3] consists of four levels (we consider only four levels here): L0 in memory and L1 to L3 on disk. Compaction from L0 into L1 is done through tiering, and compaction in the rest of the tree is done through leveling. Cooperative-LSM (C-LSM) deconstructs the monolithic structure of the LSM tree [3] to enhance its scalability by utilizing the resources of multiple machines in a more flexible way. The monolithic structure of the LSM tree lacks flexibility: the only way to deal with an increased load is to re-partition the data and distribute it across nodes. C-LSM comprises three components: leader, compactor, and backup. The leader receives write requests; it maintains levels L0 and L1 of the LSM tree and performs minor compactions. The compactor maintains the remaining levels (L2 and L3) and is responsible for compacting them. The backup maintains a copy of the entire LSM tree for fault tolerance and read availability. The advantages that C-LSM provides are two-fold:

• components can be placed on different machines, and
• more than one instance of each component can be run.

Running more than one instance of each component enables various performance advantages:

• Increasing the number of leaders allows the system to ingest data faster, because the performance of a single machine no longer limits the system.
• Increasing the number of compactors offloads compaction [1] to more nodes and thus reduces the impact of compaction on other functions.
• Increasing the number of backups increases read availability.

All of these advantages can also be achieved by re-partitioning the data and distributing the partitions across nodes, which is what most current LSM variants do. However, we hypothesize that partitioning is not feasible in all cases — for example, in a dynamic workload where access patterns are unpredictable, no clear partitioning is feasible.
In this case, the developer either has to endure the overhead of constantly re-partitioning the data, or cannot utilize the system resources efficiently if no re-partitioning is done. C-LSM enables scaling up (and down) with less overhead than re-partitioning: if a partition suddenly receives more requests, one can simply add a new component instance on another node. Each component has different characteristics in terms of how it affects the workload and I/O, and the flexibility to break the tree into components makes it possible to distribute them in ways that increase overall efficiency. Running multiple instances of the three components raises interesting challenges in ensuring that they work together without introducing inconsistencies. We aim to solve this through careful design of how the components interact and of how decisions are managed when failures or scaling events happen. Another interesting problem is running multiple instances of C-LSM, each dedicated to one edge node or a cluster of edge nodes. For mobile-based or real-time data-analysis applications, more and more data needs to be processed at the edge nodes [2] themselves, and a dedicated C-LSM will improve overall latency. There are also downsides to running more than one instance of a component that need to be addressed. For example, having more than one compaction server leads to the need for compaction across machines and/or redundancy of data, and having more than one leader requires maintaining linearizable access.
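The three-way decomposition described above can be illustrated with a minimal sketch. This is not the paper's implementation — all class names, the memtable threshold, and the merge logic are illustrative assumptions — but it shows the division of responsibility: a leader owning L0 (in memory) and L1 (tiered runs), a compactor owning the leveled L2/L3, and a backup holding a full read-serving copy.

```python
class Leader:
    """Hypothetical leader: receives writes, holds L0 (memtable) and L1,
    and performs minor compaction by tiering (flushed runs accumulate)."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}             # L0: in-memory writes
        self.l1_runs = []              # L1: list of sorted runs (tiering)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compact()

    def minor_compact(self):
        # Tiering: flush the memtable as a new sorted run; do not merge.
        self.l1_runs.append(sorted(self.memtable.items()))
        self.memtable = {}


class Compactor:
    """Hypothetical compactor: maintains L2 and L3, compacting by leveling
    (each level is kept as a single sorted run)."""
    def __init__(self):
        self.levels = {2: [], 3: []}

    def absorb(self, l1_runs):
        # Leveling: merge incoming L1 runs into L2; newer values win.
        merged = dict(self.levels[2])
        for run in l1_runs:
            merged.update(dict(run))
        self.levels[2] = sorted(merged.items())


class Backup:
    """Hypothetical backup: keeps a copy of the whole tree for
    fault tolerance and read availability."""
    def __init__(self):
        self.snapshot = {}

    def replicate(self, leader, compactor):
        # Rebuild oldest-to-newest so newer writes override older ones.
        view = {}
        for level in (3, 2):
            view.update(dict(compactor.levels[level]))
        for run in leader.l1_runs:
            view.update(dict(run))
        view.update(leader.memtable)
        self.snapshot = view

    def get(self, key):
        return self.snapshot.get(key)
```

Because each role is a separate object, placing the roles on different machines or running several instances of one role (more leaders to ingest faster, more compactors to offload merging, more backups for read availability) becomes a deployment decision rather than a change to the tree itself — which is the flexibility the abstract argues for.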