Title: FlameStream
Authors: I. Kuralenok, Artem Trofimov, Nikita Marshalkin, Boris Novikov
DOI: 10.1145/3206333.3209273
Abstract: Exactly-once semantics without high latency overhead is still hard to achieve in state-of-the-art stream processing systems. We introduce a model that provides exactly-once semantics through a lightweight optimistic approach for obtaining determinism and idempotence, and we show its feasibility with a prototype.
{"title":"MapRDD","authors":"Zhenyu Li, Stephen Jarvis","doi":"10.1145/3206333.3206335","DOIUrl":"https://doi.org/10.1145/3206333.3206335","url":null,"abstract":"The Resilient Distributed Dataset (RDD) is the core memory abstraction behind the popular data-analytic framework Apache Spark. We present an extension to the Resilient Distributed Dataset for map transformations, that we call MapRDD, which takes advantage of the underlying relations between records in the parent and child datasets, in order to achieve random-access of individual records in a partition. The design is complemented by a new MemoryStore, which manages data sampling and data transfers asynchronously. We use the ImageNet dataset to demonstrate that: (I) The initial data loading phase is redundant and can be completely avoided; (II) Sampling on the CPU can be entirely overlapped with training on the GPU to achieve near full occupancy; (III) CPU processing cycles and memory usage can be reduced by more than 90%, allowing other applications to be run simultaneously; (IV) Constant training step time can be achieved, regardless of the size of the partition, for up to 1.3 million records in our experiments. We expect to obtain the same improvements in other RDD transformations via further research on finer-grained implicit & explicit dataset relations.","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129379205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Exploiting Data Partitioning To Provide Approximate Results
Authors: Bruhathi Sundarmurthy, Paraschos Koutris, J. Naughton
DOI: 10.1145/3206333.3206337
Abstract: Co-hash partitioning is a popular partitioning strategy in distributed query processing, in which tables are co-located using join predicates. In this paper, we study the benefits of co-hash partitioning for obtaining approximate answers.
{"title":"Six Pass MapReduce Implementation of Strassen's Algorithm for Matrix Multiplication","authors":"Prakash V. Ramanan","doi":"10.1145/3206333.3206336","DOIUrl":"https://doi.org/10.1145/3206333.3206336","url":null,"abstract":"Consider the multiplication of two n x n matrices. A straight-forward sequential algorithm for computing the product takes Θ(n3) time. Strassen [21] presented an algorithm that takes Θ(nlg 7) time; lg denotes logarithm to the base 2; lg 7 is about 2.81. Now, consider the implementation of these two algorithms (straightforward and Strassen) in the mapReduce framework. Several papers have studied mapReduce implementations of the straight-forward algorithm; this algorithm can be implemented using a constant number (typically, one or two) of mapReduce passes. In this paper, we study the mapReduce implementation of Strassen's algorithm. If we unwind the recursion, Strassen's algorithm consists of three parts, Parts I--III. Direct mapReduce implementations of the three parts take lg n, 1 and lg n passes, respectively; total number of passes is 2 lg n + 1. In a previous paper [7], we showed that Part I can be implemented in 2 passes, with total work Θ(nlg 7), and reducer size and reducer workload o(n). In this paper, we show that Part III can be implemented in three passes. So, overall, Strassen's algorithm can be implemented in six passes, with total work Θ(nlg 7), and reducer size and reducer workload o(n).","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125423272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distribution-Aware Stream Partitioning for Distributed Stream Processing Systems","authors":"Anil Pacaci, M. Tamer Özsu","doi":"10.1145/3206333.3206338","DOIUrl":"https://doi.org/10.1145/3206333.3206338","url":null,"abstract":"The performance of modern distributed stream processing systems is largely dependent on balanced distribution of the workload across cluster. Input streams with large, skewed domains pose challenges to these systems, especially for stateful applications. Key splitting, where state of a single key is partially maintained across multiple workers, is a simple yet effective technique to reduce load imbalance in such systems. However it comes with the cost of increased memory overhead which has been neglected by existing techniques so far. In this paper we present a novel stream partitioning algorithm for intra-operator parallelism which adapts to the underlying stream distribution in an online manner and provides near-optimal load imbalance with minimal memory overhead. Our technique relies on explicitly routing frequent items using a greedy heuristic which considers both load imbalance and space requirements. It uses hashing for in frequent items to keep the size of routing table small. Through extensive experimentation with real and synthetic datasets, we show that our proposed solution consistently provides near-optimal load imbalance and memory footprint over variety of distributions. Our experiments on Apache Storm show up to an order of magnitude increase in overall throughput and up to 80% space savings over state-of-the-art stream partitioning techniques.","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"74 274 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125964717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latency-conscious dataflow reconfiguration","authors":"Moritz Hoffmann, Frank McSherry, Andrea Lattuada","doi":"10.1145/3206333.3206334","DOIUrl":"https://doi.org/10.1145/3206333.3206334","url":null,"abstract":"We propose a prototype incremental data migration mechanism for stateful distributed data-parallel dataflow engines with latency objectives. When compared to existing scaling mechanisms, our prototype has the following differentiating characteristics: (i) the mechanism provides tunable granularity for avoiding latency spikes, (ii) reconfigurations can be prepared ahead of time to avoid runtime coordination, and (iii) the implementation only relies on existing dataflow APIs and need not require system modifications. We demonstrate our proposal on example computations with varying amounts of state that needs to be migrated, which is a non-trivial task for systems like Dhalion and Flink. Our implementation, prototyped on Timely Dataflow, provides a scalable stateful operator template compatible with existing APIs that carefully reorganizes data to minimize migration overhead. Compared to naïve approaches we reduce service latencies by orders of magnitude.","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123202767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Automatic Caching Decision for Scientific Dataflow Execution in Apache Spark
Authors: V. Gottin, Edward Pacheco, Jonas Dias, A. Ciarlini, B. Costa, Wagner Vieira, Y. M. Souto, Paulo F. Pires, F. Porto, J. G. Rittmeyer
DOI: 10.1145/3206333.3206339
Abstract: Demands for large-scale data analysis and processing have led to the development and widespread adoption of computing frameworks that leverage in-memory data processing, largely outperforming disk-based systems. One such framework is Apache Spark, which adopts a lazy-evaluation execution model: the execution of a transformation in a dataflow is delayed until its results are required by an action. Furthermore, a transformation's results are not kept in memory by default, so the same transformation must be re-executed whenever another action requires it. To spare unnecessary re-execution of entire pipelines of frequently referenced operations, Spark lets the programmer explicitly define cache operations that persist transformation results. However, many factors affect the efficiency of a cache in a dataflow, including the existence of other cache operations. Thus, even with a reasonably small number of transformations, choosing the optimal combination of cache operations poses a nontrivial problem, highlighted by the fact that intuitive strategies, especially when considered in isolation, may actually harm dataflow efficiency. In this work, we present an automatic procedure to compute a substantially optimal combination of cache operations given a dataflow definition and a simple cost model for the operations. Our results on an astronomy dataflow use case show that our algorithm is resilient to changes in the dataflow and cost model, and that it outperforms intuitive strategies, consistently deciding on a substantially optimal combination of caches.
{"title":"Adaptive MapReduce Similarity Joins","authors":"Samuel McCauley, Francesco Silvestri","doi":"10.1145/3206333.3206340","DOIUrl":"https://doi.org/10.1145/3206333.3206340","url":null,"abstract":"Similarity joins are a fundamental database operation. Given data sets S and R, the goal of a similarity join is to find all points x ∈ S and y ∈ R with distance at most r. Recent research has investigated how locality-sensitive hashing (LSH) can be used for similarity join, and in particular two recent lines of work have made exciting progress on LSH-based join performance. Hu, Tao, and Yi (PODS 17) investigated joins in a massively parallel setting, showing strong results that adapt to the size of the output. Meanwhile, Ahle, Aumüller, and Pagh (SODA 17) showed a sequential algorithm that adapts to the structure of the data, matching classic bounds in the worst case but improving them significantly on more structured data. We show that this adaptive strategy can be adapted to the parallel setting, combining the advantages of these approaches. In particular, we show that a simple modification to Hu et al.'s algorithm achieves bounds that depend on the density of points in the dataset as well as the total outsize of the output. Our algorithm uses no extra parameters over other LSH approaches (in particular, its execution does not depend on the structure of the dataset), and is likely to be efficient in practice.","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114472773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","authors":"F. Afrati, J. Sroka, J. Hidders","doi":"10.1145/3206333","DOIUrl":"https://doi.org/10.1145/3206333","url":null,"abstract":"The papers in this volume were presented at the 3rd International Workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR 2016), held in San Francisco, CA, US on July 1, 2016. The workshop was co-located with ACM SIGMOD, and attracted 19 submissions, of which 10 were selected by the program committee for oral presentation and for publication in this volume. This corresponds to an acceptance rate of 53%, which indicates the high level of activity in the domain of the workshop and its ability to attract many good papers.","PeriodicalId":253916,"journal":{"name":"Proceedings of the 5th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127168047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}