{"title":"Box queries over multi-dimensional streams","authors":"R. Friedman, Rana Shahout","doi":"10.1145/3465480.3466925","DOIUrl":"https://doi.org/10.1145/3465480.3466925","url":null,"abstract":"Answering statistical queries about streams of online arriving data is becoming increasingly important. Often, such data includes multiple attributes, so data elements can be viewed as points in a multi-dimensional universe. This paper extends existing works on streaming algorithms by studying the ability to perform box queries on online multi-dimensional data streams. We develop three algorithms, C-DARQ, DARQ and MARQ, that support such capabilities for a large number of statistical functions including (but not limited to) counting, frequency estimation, heavy-hitters, etc. The protocols are analyzed and evaluated over synthetic datasets and datasets from Kaggle in multiple dimensions (up to 8). Our algorithms asymptotically improve the space bounds as well as update and query performance of existing works. Unlike known approaches, our algorithms can also be used to solve a larger class of problems beyond counting. We further discuss extending our work to the sliding window model and to settings where the dimensions' bounds are a priori unknown.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132487190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HawkEDA: a tool for quantifying data integrity violations in event-driven microservices","authors":"Prangshuman Das, Rodrigo Laigner, Yongluan Zhou","doi":"10.1145/3465480.3467838","DOIUrl":"https://doi.org/10.1145/3465480.3467838","url":null,"abstract":"A microservice architecture advocates for subdividing an application into small and independent components, each communicating via well-defined APIs or asynchronous events, to allow for higher scalability, availability, and fault isolation. However, the implementation of a substantial amount of data management logic at the application tier and the existence of functional dependencies cutting across microservices create a great barrier for developers to reason about application safety and performance trade-offs. To fill this gap, this work presents HawkEDA, the first data management tool that allows practitioners to experiment with their microservice applications under different real-world workloads to quantify the amount of data integrity anomalies. In our demonstration, we present a case study of a popular open-source event-driven microservice to showcase the interface through which developers specify application semantics and the flexibility of HawkEDA.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123820716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An experimental framework for improving the performance of BFT consensus for future permissioned blockchains","authors":"Man-Kit Sit, Manuel Bravo, Z. István","doi":"10.1145/3465480.3466922","DOIUrl":"https://doi.org/10.1145/3465480.3466922","url":null,"abstract":"Permissioned Blockchains are increasingly considered in enterprise use-cases, many of which do not require geo-distribution, or even disallow it due to legislation. Examples include countrywide networks, such as Alastria, or those deployed using cloud-based platforms such as IBM Blockchain Platform. We expect these blockchains to eventually run in environments with high bandwidth and low latency modern networks, as well as with advanced programmable hardware accelerators. Even though there is renewed interest in BFT consensus algorithms with various proposals targeting Permissioned Blockchains, related work does not optimize for fast networks and does not incorporate hardware accelerators - we make the case that doing so will pay off in the long run. To this end, we re-implemented the seminal PBFT algorithm in a way that allows us to measure different configurations of the protocol. Through this we explore the benefits of various common optimization strategies and show that the protocol is unlikely to saturate more than 10Gbps networks without relying on specialized hardware-based offloading. Based on the experimental results, we discuss two concrete ways in which the cost of consensus in Permissioned Blockchains could be reduced in high-speed networking environments, namely, offloading to SmartNICs and implementing the protocol on standalone FPGAs.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133849906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The DEBS 2021 grand challenge: analyzing environmental impact of worldwide lockdowns","authors":"Jawad Tahir, Christoph Doblander, R. Mayer, S Frischbier, H. Jacobsen","doi":"10.1145/3465480.3467836","DOIUrl":"https://doi.org/10.1145/3465480.3467836","url":null,"abstract":"The ACM DEBS 2021 Grand Challenge (GC) is the eleventh episode of a series of programming challenge competitions that began in 2011. Every year, participants of the GC are provided with new datasets and practical problems, and the challenge receives novel, high-performance solutions from research, academia, and industry. The theme of the DEBS '21 GC is analyzing the environmental effects of COVID-19 restrictions. This year's edition of the GC is the first to explicitly focus on the integration and practicability of the solutions by fostering the use of distributed solutions based on widely-used open-source platforms and by requiring participants to address non-functional properties besides correctness of the solution. This paper describes the dataset used, formalizes the problem statement, and explains the evaluation platform that made dataset distribution and remote evaluation possible with our new virtualized infrastructure.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117003902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enforcing consistency in microservice architectures through event-based constraints","authors":"Anna Lesniak, Rodrigo Laigner, Yongluan Zhou","doi":"10.1145/3465480.3467839","DOIUrl":"https://doi.org/10.1145/3465480.3467839","url":null,"abstract":"Microservice architectures are an emerging paradigm for developing event-driven applications. By prescribing that an application is decomposed into small and independent components, each encapsulating its own state and communicating via asynchronous events, new components and events can be easily integrated into the system. However, by pursuing a model where events are generated and processed at the application-level, developers have a hard time to safeguard arbitrary event interleavings from doing harm to application safety. To address these challenges, we start by analyzing event-driven microservice open-source applications to identify unsafe interleavings. Next, we categorize event-based constraints to address such unsafe encodings, providing an easy-to-use guide for microservice developers. Finally, we introduce StreamConstraints, a library built on top of Kafka Streams designed to enforce explicit event-based constraints defined by developers. We showcase StreamConstraints based on the case of a popular event-driven microservice system, and demonstrate how it could benefit from event-based constraints to ensure application safety.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122037819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Descriptor based consensus for blockchain transactions","authors":"Zachary Painter, Victor Cook, Christina L. Peterson, D. Dechev","doi":"10.1145/3465480.3466927","DOIUrl":"https://doi.org/10.1145/3465480.3466927","url":null,"abstract":"Blockchain networks use consensus mechanisms so participants can exchange transactions without the need to rely on a trusted third party. Consensus mechanisms using Proof of Work burn significant energy to select a block miner and the delay limits performance. Other consensus mechanisms such as Proof of Stake or Practical Byzantine Fault Tolerance still designate a single validator to append a block to the chain, preventing blocks from being built and published in parallel. In this paper we introduce a new consensus mechanism, Proof of Descriptor, enabling clients to work together to publish blockchain transactions using a descriptor object which stores information on the cooperative parallel execution of transactions. Proof of Descriptor consensus allows commutative transactions to be mined individually. It does not require a leader to propose the next block, enabling clients to cooperate on completing transactions, assembling blocks and publishing them. We demonstrate that our approach is less prone to attack since it is not vulnerable to a malicious leader, while simulations show a potential 20x improvement over the fastest sequential blockchain, Solana.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131483832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable anomaly detection on high-dimensional time series data","authors":"Bijan Rad, Fei Song, Vincent Jacob, Y. Diao","doi":"10.1145/3465480.3468292","DOIUrl":"https://doi.org/10.1145/3465480.3468292","url":null,"abstract":"As enterprise information systems are collecting event streams from various sources, the ability of a system to automatically detect anomalous events and further provide human-readable explanations is of paramount importance. In this paper, we present an approach to integrated anomaly detection (AD) and explanation discovery (ED), which is able to leverage state-of-the-art Deep Learning (DL) techniques for anomaly detection, while being able to recover human-readable explanations for detected anomalies. At the core of the framework is a new human-interpretable dimensionality reduction (HIDR) method that not only reduces the dimensionality of the data, but also maintains a meaningful mapping from the original features to the transformed low-dimensional features. Such transformed features can be fed into any DL technique designed for anomaly detection, and the feature mapping will be used to recover human-readable explanations through a suite of new feature selection and explanation discovery methods. Evaluation using a recent explainable anomaly detection benchmark demonstrates the efficiency and effectiveness of HIDR for AD, and the result that while all three recent ED techniques failed to generate quality explanations on high-dimensional data, our HIDR-based ED framework can enable them to generate explanations with dramatic improvements in the quality of explanations and computational efficiency.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128169802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time processing of geo-distributed financial data","authors":"Antonios Kontaxakis, Antonios Deligiannakis, Holger Arndt, Stefan Burkard, Claus-Peter Kettner, Elke Pelikan, Kathleen Noack","doi":"10.1145/3465480.3467842","DOIUrl":"https://doi.org/10.1145/3465480.3467842","url":null,"abstract":"Enabling real-time processing of financial data streams is extremely challenging, especially considering that typical operations that interest investors often require combining data across (a potentially quadratic number of) different pairs of stocks. In this paper we present the architecture and the components of our system for the real-time processing of geo-distributed financial data at scale. Our system can scale to larger resources and utilizes a Synopses Data Engine in order to efficiently handle complex cross-stock queries, such as the ones required to detect systemic risk or to help forecast the value of some stock. The rich set of supported operations is depicted at the Visual Analytics component of our system.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126553989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web stream processing with RSP4J","authors":"Riccardo Tommasini, P. Bonte","doi":"10.1145/3465480.3467844","DOIUrl":"https://doi.org/10.1145/3465480.3467844","url":null,"abstract":"Social Media Analysis, Internet of Things, and Fake News detection have unveiled the relevance of real-time analytics on the Web. As a consequence, the Web infrastructure is evolving to enable continuous and reactive data access. Since data streams available on the Web originate from a variety of sources, they are highly heterogeneous. Indeed, addressing data variety and velocity simultaneously is inevitable. Stream Reasoning is the research field that studies how to combine data integration techniques with stream processing technologies. In particular, solutions for RDF Stream Processing (RSP) combine stream processing notions with data integration standards. This tutorial paper presents RSP4J, an innovative API that aims at fostering the adoption of RSP by simplifying the usage, benchmarking, and fast prototyping of Web Stream Processing applications.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"90 24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126033542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"S2CE","authors":"N. Kourtellis, H. Herodotou, M. Grzenda, P. Wawrzyniak, A. Bifet","doi":"10.1145/3465480.3466926","DOIUrl":"https://doi.org/10.1145/3465480.3466926","url":null,"abstract":"The explosive increase in volume, velocity, variety, and veracity of data generated by distributed and heterogeneous nodes such as IoT and other devices continuously challenges the state of the art in big data processing platforms and mining techniques. Consequently, it reveals an urgent need to address the ever-growing gap between this expected exascale data generation and the extraction of insights from these data. To address this need, this position paper proposes Stream to Cloud & Edge (S2CE), a first-of-its-kind, optimized, multi-cloud and edge orchestrator that is easily configurable, scalable, and extensible. S2CE will enable machine and deep learning over voluminous and heterogeneous data streams running on hybrid cloud and edge settings, while offering the necessary functionalities for practical and scalable processing: data fusion and preprocessing, sampling and synthetic stream generation, cloud and edge smart resource management, and distributed processing.","PeriodicalId":217173,"journal":{"name":"Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116829161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}