{"title":"CHIC: a combination-based recommendation system","authors":"Manasi Vartak, S. Madden","doi":"10.1145/2463676.2465270","DOIUrl":"https://doi.org/10.1145/2463676.2465270","url":null,"abstract":"Current recommender systems are focused largely on recommending items based on similarity. For instance, Netflix can recommend movies similar to previously viewed movies, and Amazon can recommend items based on ratings of similar users. Although similarity-based recommendation works well for books and movies, it provides an incomplete solution for items such as clothing or furniture which are inherently used in combination with other items of the same type, e.g., shirt with pants, and desk with a chair. As a result, the decision to buy a clothing or furniture item depends not only on the item itself, but also on how well it works with other items of that type. Recommending such items therefore requires a combination-based recommendation system that given an item, can suggest interesting and diverse combinations containing that item. This problem is challenging because features affecting combination quality are often difficult to identify; quality, being a function of all items in the combination, cannot be computed independently; and there are an exponential number of combinations to explore. In this demonstration, we present CHIC, a first-of-its-kind, combination-based recommendation system for clothing. The audience will interact with our system through the CHIC mobile app which allows the user to take a picture of a clothing item and search for interesting combinations containing the item instantly. The audience can also compete with CHIC to create alternate ensembles and compare quality. Finally, we highlight via visualizations the core modules of CHIC including model building and our novel search and classification algorithm, C-Search.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"145 4 1","pages":"981-984"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83079990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COCCUS: self-configured cost-based query services in the cloud","authors":"I. Konstantinou, Verena Kantere, Dimitrios Tsoumakos, N. Koziris","doi":"10.1145/2463676.2465233","DOIUrl":"https://doi.org/10.1145/2463676.2465233","url":null,"abstract":"Recently, a large number of pay-as-you-go data services are offered over cloud infrastructures. Data service providers need appropriate and flexible query charging mechanisms and query optimization that take into consideration cloud operational expenses, pricing strategies and user preferences. Yet, existing solutions are static and non-configurable. We demonstrate COCCUS, a modular system for cost-aware query execution, adaptive query charge and optimization of cloud data services. The audience can set their queries along with their execution preferences and budget constraints, while COCCUS adaptively determines query charge and manages secondary data structures according to various economic policies. We demonstrate COCCUS's operation over centralized and shared nothing CloudDBMS architectures on top of public and private IaaS clouds. The audience is enabled to set economic policies and execute various workloads through a comprehensive GUI. COCCUS's adaptability is showcased using real-time graphs depicting a number of key performance metrics.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"162 8 1","pages":"1041-1044"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83291361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic synthesis of out-of-core algorithms","authors":"Yannis Klonatos, Andres Nötzli, A. Spielmann, Christoph E. Koch, Viktor Kunčak","doi":"10.1145/2463676.2465334","DOIUrl":"https://doi.org/10.1145/2463676.2465334","url":null,"abstract":"We present a system for the automatic synthesis of efficient algorithms specialized for a particular memory hierarchy and a set of storage devices. The developer provides two independent inputs: 1) an algorithm that ignores memory hierarchy and external storage aspects; and 2) a description of the target memory hierarchy, including its topology and parameters. Our system is able to automatically synthesize memory-hierarchy and storage-device-aware algorithms out of those specifications, for tasks such as joins and sorting. The framework is extensible and allows developers to quickly synthesize custom out-of-core algorithms as new storage technologies become available.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"11 1","pages":"133-144"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78830313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FAST: differentially private real-time aggregate monitor with filtering and adaptive sampling","authors":"Liyue Fan, Li Xiong, V. Sunderam","doi":"10.1145/2463676.2465253","DOIUrl":"https://doi.org/10.1145/2463676.2465253","url":null,"abstract":"Sharing aggregate statistics of private data can be of great value when data mining can be performed in real-time to understand important phenomena such as influenza outbreaks or traffic congestion. However, to this date there have been no tools for releasing real-time aggregated data with differential privacy, a strong and provable privacy guarantee. We propose FAST, a real-time system that allows differentially private aggregate sharing and time-series analytics. FAST employs a set of novel, adaptive strategies to improve the utility of shared/released data while guaranteeing the user-specified level of differential privacy. We will demonstrate the challenges and our solutions in the context of prepared data sets as well as live participation data dynamically collected among the SIGMOD'13 attendees.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"64 1","pages":"1065-1068"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91261823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stat!: an interactive analytics environment for big data","authors":"Mike Barnett, B. Chandramouli, R. Deline, S. Drucker, Danyel Fisher, J. Goldstein, P. Morrison, John C. Platt","doi":"10.1145/2463676.2463683","DOIUrl":"https://doi.org/10.1145/2463676.2463683","url":null,"abstract":"Exploratory analysis on big data requires us to rethink data management across the entire stack -- from the underlying data processing techniques to the user experience. We demonstrate Stat! -- a visualization and analytics environment that allows users to rapidly experiment with exploratory queries over big data. Data scientists can use Stat! to quickly refine to the correct query, while getting immediate feedback after processing a fraction of the data. Stat! can work with multiple processing engines in the backend; in this demo, we use Stat! with the Microsoft StreamInsight streaming engine. StreamInsight is used to generate incremental early results to queries and refine these results as more data is processed. Stat! allows data scientists to explore data, dynamically compose multiple queries to generate streams of partial results, and display partial results in both textual and visual form.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"47 1","pages":"1013-1016"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89961005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building, maintaining, and using knowledge bases: a report from the trenches","authors":"Omkar Deshpande, Digvijay S. Lamba, Michel Tourn, Sanjib Das, S. Subramaniam, A. Rajaraman, Venky Harinarayan, A. Doan","doi":"10.1145/2463676.2465297","DOIUrl":"https://doi.org/10.1145/2463676.2465297","url":null,"abstract":"A knowledge base (KB) contains a set of concepts, instances, and relationships. Over the past decade, numerous KBs have been built, and used to power a growing array of applications. Despite this flurry of activities, however, surprisingly little has been published about the end-to-end process of building, maintaining, and using such KBs in industry. In this paper we describe such a process. In particular, we describe how we build, update, and curate a large KB at Kosmix, a Bay Area startup, and later at WalmartLabs, a development and research lab of Walmart. We discuss how we use this KB to power a range of applications, including query understanding, Deep Web search, in-context advertising, event monitoring in social media, product search, social gifting, and social mining. Finally, we discuss how the KB team is organized, and the lessons learned. Our goal with this paper is to provide a real-world case study, and to contribute to the emerging direction of building, maintaining, and using knowledge bases for data management applications.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"1 1","pages":"1209-1220"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83507985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance and resource modeling in highly-concurrent OLTP workloads","authors":"Barzan Mozafari, C. Curino, Alekh Jindal, S. Madden","doi":"10.1145/2463676.2467800","DOIUrl":"https://doi.org/10.1145/2463676.2467800","url":null,"abstract":"Database administrators of Online Transaction Processing (OLTP) systems constantly face difficult questions. For example, \"What is the maximum throughput I can sustain with my current hardware?\", \"How much disk I/O will my system perform if the requests per second double?\", or \"What will happen if the ratio of transactions in my system changes?\". Resource prediction and performance analysis are both vital and difficult in this setting. Here the challenge is due to high degrees of concurrency, competition for resources, and complex interactions between transactions, all of which non-linearly impact performance. Although difficult, such analysis is a key component in enabling database administrators to understand which queries are eating up the resources, and how their system would scale under load. In this paper, we introduce our framework, called DBSeer, that addresses this problem by employing statistical models that provide resource and performance analysis and prediction for highly concurrent OLTP workloads. Our models are built on a small amount of training data from standard log information collected during normal system operation. These models are capable of accurately measuring several performance metrics, including resource consumption on a per-transaction-type basis, resource bottlenecks, and throughput at different load levels. We have validated these models on MySQL/Linux with numerous experiments on standard benchmarks (TPC-C) and real workloads (Wikipedia), observing high accuracy (within a few percent error) when predicting all of the above metrics.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"264 1","pages":"301-312"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75922561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continuous outlier detection in data streams: an extensible framework and state-of-the-art algorithms","authors":"D. Georgiadis, Maria Kontaki, A. Gounaris, A. Papadopoulos, K. Tsichlas, Y. Manolopoulos","doi":"10.1145/2463676.2463691","DOIUrl":"https://doi.org/10.1145/2463676.2463691","url":null,"abstract":"Anomaly detection is an important data mining task, aiming at the discovery of elements that show significant diversion from the expected behavior; such elements are termed as outliers. One of the most widely employed criteria for determining whether an element is an outlier is based on the number of neighboring elements within a fixed distance (R), against a fixed threshold (k). Such outliers are referred to as distance-based outliers and are the focus of this work. In this demo, we show both an extendible framework for outlier detection algorithms and specific outlier detection algorithms for the demanding case where outlier detection is continuously performed over a data stream. More specifically: i) first we demonstrate a novel flavor of an open-source publicly available tool for Massive Online Analysis (MOA) that is endowed with capabilities to encapsulate algorithms that continuously detect outliers and ii) second, we present four online outlier detection algorithms. Two of these algorithms have been designed by the authors of this demo, with a view to improving on key aspects related to outlier mining, such as running time, flexibility and space requirements.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"46 1","pages":"1061-1064"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87552472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lightweight authentication of linear algebraic queries on data streams","authors":"Stavros Papadopoulos, Graham Cormode, Antonios Deligiannakis, M. Garofalakis","doi":"10.1145/2463676.2465281","DOIUrl":"https://doi.org/10.1145/2463676.2465281","url":null,"abstract":"We consider a stream outsourcing setting, where a data owner delegates the management of a set of disjoint data streams to an untrusted server. The owner authenticates his streams via signatures. The server processes continuous queries on the union of the streams for clients trusted by the owner. Along with the results, the server sends proofs of result correctness derived from the owner's signatures, which are easily verifiable by the clients. We design novel constructions for a collection of fundamental problems over streams represented as linear algebraic queries. In particular, our basic schemes authenticate dynamic vector sums and dot products, as well as dynamic matrix products. These techniques can be adapted for authenticating a wide range of important operations in streaming environments, including group by queries, joins, in-network aggregation, similarity matching, and event processing. All our schemes are very lightweight, and offer strong cryptographic guarantees derived from formal definitions and proofs. We experimentally confirm the practicality of our schemes.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"53 1","pages":"881-892"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90917470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure database-as-a-service with Cipherbase","authors":"A. Arasu, Spyros Blanas, Ken Eguro, Manas R. Joglekar, R. Kaushik, Donald Kossmann, Ravishankar Ramamurthy, P. Upadhyaya, R. Venkatesan","doi":"10.1145/2463676.2467797","DOIUrl":"https://doi.org/10.1145/2463676.2467797","url":null,"abstract":"Data confidentiality is one of the main concerns for users of public cloud services. The key problem is protecting sensitive data from being accessed by cloud administrators who have root privileges and can remotely inspect the memory and disk contents of the cloud servers. While encryption is the basic mechanism that can be leveraged to provide data confidentiality, providing an efficient database-as-a-service that can run on encrypted data raises several interesting challenges. In this demonstration we outline the functionality of Cipherbase --- a full-fledged SQL database system that supports the full generality of a database system while providing high data confidentiality. Cipherbase has a novel architecture that tightly integrates custom-designed trusted hardware for performing operations on encrypted data securely such that an administrator cannot get access to any plaintext corresponding to sensitive data.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"6 1","pages":"1033-1036"},"PeriodicalIF":0.0,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91153126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}