{"title":"CMOA: continuous moving object anonymization","authors":"Tsubasa Takahashi, Shinya Miyakawa","doi":"10.1145/2351476.2351486","DOIUrl":"https://doi.org/10.1145/2351476.2351486","url":null,"abstract":"This paper proposes a continuous anonymization method for a trajectory stream. In today's mobile environment, positions of moving objects are frequently sensed and collected. For real-time movement pattern analyses of people and automobiles, trajectory streams have attracted a lot of attention. Trajectory streams lead to sensitive locations, such as homes and personal hospitals. Additionally, a set of spatio-temporal data might identify a user from a trajectory stream. Therefore, publishing original trajectory streams may cause critical breaches of privacy. To protect privacy of users, we need a mechanism which makes it difficult to identify users from crowds of trajectory streams. Several techniques for anonymizing trajectories have been proposed. Anonymized trajectories can be published without concern about privacy issues. However, for the continuous publishing of trajectory streams, existing trajectory anonymization methods are not suitable because they anonymize the overall trajectories at a time. If the existing methods are applied in the continuous publishing, the resolution of anonymized trajectory is hugely degraded or trace-ability is lost. In this paper, we propose an anonymization technique for a trajectory stream. The method continuously anonymizes trajectory streams one by one, and dynamically reforms anonymized trajectory streams to improve the resolution. The experiments showed that our method could keep the resolution at a constant level.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"53 1","pages":"81-90"},"PeriodicalIF":0.0,"publicationDate":"2012-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81898122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the semantics of ST4SQL, a multidimensional spatio-temporal query language","authors":"G. Pozzani, Carlo Combi","doi":"10.1145/2351476.2351504","DOIUrl":"https://doi.org/10.1145/2351476.2351504","url":null,"abstract":"Pozzani and Combi proposed ST4SQL, an SQL-based query language extending SQL with new constructs for querying spatio-temporal data. In particular, ST4SQL deals with different temporal and spatial semantics, allowing one to specify how the system has to manage temporal and spatial dimensions for evaluating queries. Moreover, the query language introduces new constructs for grouping data with respect to temporal and spatial dimensions. All proposed constructs take into account data qualified with granularities [2]. In this paper we briefly present ST4SQL and we present, also through some examples, its semantics with respect to the standard SQL one.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"27 1","pages":"222-229"},"PeriodicalIF":0.0,"publicationDate":"2012-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84503240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SciQL: bridging the gap between science and relational DBMS","authors":"Y. Zhang, M. Kersten, M. Ivanova, N. Nes","doi":"10.1145/2076623.2076639","DOIUrl":"https://doi.org/10.1145/2076623.2076639","url":null,"abstract":"Scientific discoveries increasingly rely on the ability to efficiently grind massive amounts of experimental data using database technologies. To bridge the gap between the needs of the Data-Intensive Research fields and the current DBMS technologies, we propose SciQL (pronounced as 'cycle'), the first SQL-based query language for scientific applications with both tables and arrays as first class citizens. It provides a seamless symbiosis of array-, set- and sequence-interpretations. A key innovation is the extension of value-based grouping of SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between elements positions. This leads to a generalisation of window-based query processing with wide applicability in science domains. This paper describes the main language features of SciQL and illustrates it using time-series concepts.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"6 1","pages":"124-133"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87175077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Answering complex structured queries over the deep web","authors":"Fan Wang, G. Agrawal","doi":"10.1145/2076623.2076638","DOIUrl":"https://doi.org/10.1145/2076623.2076638","url":null,"abstract":"A large part of the data on the World Wide Web resides in the deep web. Most deep web data sources only support simple text interfaces for querying them, which are easy to use but have limited expressive power. Therefore, processing complex structured queries over the deep web currently involves a large amount of manual work. Our work focuses on addressing the existing gap between users' need of expressing and executing complex structured queries over the deep web, and the simple and limited input interfaces of the existing deep web data sources.\nThis paper presents a query planning problem formulation, with novel algorithms and optimizations, for enabling a high-level and highly expressive query language to be supported over deep web data sources. We particularly target three types of complex queries, which are select-project-join queries, aggregation queries, and nested queries. We have developed query planning algorithms to generate query plans for each of these, and propose several optimization techniques to further speed up query plan execution.\nIn our experiments, we show our algorithm has good scalability and furthermore, for over 90% of the experimental queries, the execution time and result quality of the query plans generated by our algorithms are very close to the optimal plans generated by an exhaustive search algorithm. Furthermore, our optimization techniques outperform an existing optimization method in terms of both reduction in transmitted data records and query execution speedups.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"16 1","pages":"115-123"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73769689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"M-TOP: multi-target operator placement of query graphs for data streams","authors":"N. Cipriani, O. Schiller, B. Mitschang","doi":"10.1145/2076623.2076631","DOIUrl":"https://doi.org/10.1145/2076623.2076631","url":null,"abstract":"Nowadays, many applications process stream-based data, such as financial market analysis, network intrusion detection, or visualization applications. To process stream-based data in an application-independent manner, distributed stream processing systems emerged. They typically translate a query to an operator graph, place the operators to stream processing nodes, and execute them to process the streamed data. The operator placement is crucial in such systems, as it deeply influences query execution. Often, different stream-based applications require dedicated placement of query graphs according to their specific objectives, e.g. bandwidth not less than 500 MBit/s and costs not more than 1 cost unit. This fact constrains operator placement. Existing approaches do not take into account application-specific objectives, thus not reflecting application-specific placement decisions. As objectives might conflict among each other, operator placement is subject to delicate trade-offs, such as bandwidth maximization is more important than cost reduction. Thus, the challenge is to find a solution which considers the application-specific objectives and their trade-offs.\nWe present M-TOP, a QoS-aware multi-target operator placement framework for data stream systems. Particularly, we propose an operator placement strategy considering application-specific targets consisting of objectives, their respective trade-offs specifications, bottleneck conditions, and ranking schemes to compute a suitable placement. We integrated M-TOP into NexusDS, our distributed data stream processing middleware, and provide an experimental evaluation to show the effectiveness of M-TOP.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"88 1","pages":"52-60"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80264455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ParallelGDB: a parallel graph database based on cache specialization","authors":"Luis Barguñó, V. Muntés-Mulero, David Dominguez-Sal, P. Valduriez","doi":"10.1145/2076623.2076643","DOIUrl":"https://doi.org/10.1145/2076623.2076643","url":null,"abstract":"The need for managing massive attributed graphs is becoming common in many areas such as recommendation systems, proteomics analysis, social network analysis or bibliographic analysis. This is making it necessary to move towards parallel systems that allow managing graph databases containing millions of vertices and edges. Previous work on distributed graph databases has focused on finding ways to partition the graph to reduce network traffic and improve execution time. However, partitioning a graph and keeping the information regarding the location of vertices might be unrealistic for massive graphs. In this paper, we propose ParallelGDB, a new system based on specializing the local caches of any node in this system, providing a better cache hit ratio. ParallelGDB uses a random graph partitioning, avoiding complex partition methods based on the graph topology, that usually require managing extra data structures. This proposed system provides an efficient environment for distributed graph databases.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"17 1","pages":"162-169"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89390647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A statically typed query language for property graphs","authors":"Norbert Tausch, M. Philippsen, Josef Adersberger","doi":"10.1145/2076623.2076653","DOIUrl":"https://doi.org/10.1145/2076623.2076653","url":null,"abstract":"Applications that work on network-oriented data often use property graph models. Although their graph data is represented by an object-oriented model, current approaches cannot define statically typed vertex and edge sets. Thus, custom graph operations use untyped input and output sets and cannot exploit crucial concepts like polymorphism. Not only do illegal calling contexts or arguments result in runtime errors or unexpected query results, but also the resulting code tends to be error prone, unclear, and thus hard to maintain. To solve these problems, we extend the property graph model with typed graph classes and open it up to polymorphism. Our approach is an internal domain specific language for graph traversals based on the object-oriented and functional programming language Scala. A case study emphasizes the usability of our framework.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"40 1","pages":"219-225"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86328920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chimera: data sharing flexibility, shared nothing simplicity","authors":"U. F. Minhas, D. Lomet, C. Thekkath","doi":"10.1145/2076623.2076642","DOIUrl":"https://doi.org/10.1145/2076623.2076642","url":null,"abstract":"The current database market is fairly evenly split between shared nothing and data sharing systems. While shared nothing systems are easier to build and scale, data sharing systems have advantages in load balancing. In this paper we explore adding data sharing functionality as an extension to a shared nothing database system. Our approach isolates the data sharing functionality from the rest of the system and relies on well-studied, robust techniques to provide the data sharing extension. This reduces the difficulty in providing data sharing functionality, yet provides much of the flexibility of a data sharing system. We present the design and implementation of Chimera -- a hybrid database system, targeted at load balancing for many workloads, and scale-out for read-mostly workloads. The results of our experiments demonstrate that we can achieve almost linear scalability and effective load balancing with less than 2% overhead during normal operation.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"27 1","pages":"152-161"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80018595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible approximate counting","authors":"S. Mitchell, D. Day","doi":"10.1145/2076623.2076655","DOIUrl":"https://doi.org/10.1145/2076623.2076655","url":null,"abstract":"Approximate counting [18] is useful for data stream and database summarization. It can help in many settings that allow only one pass over the data, want low memory usage, and can accept some relative error. Approximate counters use fewer bits; we focus on 8-bits but our results are general. These small counters represent a sparse sequence of larger numbers. Counters are incremented probabilistically based on the spacing between the numbers they represent. Our contributions are a customized distribution of counter values and efficient strategies for deciding when to increment them.\nAt run-time, users may independently select the spacing (accuracy) of the approximate counter for small, medium, and large values. We allow the user to select the maximum number to count up to, and our algorithm will select the exponential base of the spacing. These provide additional flexibility over both classic and Csűrös's [4] floating-point approximate counting. These provide additional structure, a useful schema for users, over Kruskal and Greenberg [13].\nWe describe two new and efficient strategies for incrementing approximate counters: use a deterministic countdown or sample from a geometric distribution. In Csűrös all increments are powers of two, so random bits rather than full random numbers can be used. We also provide the option to use powers-of-two but retain flexibility. We show when each strategy is fastest in our implementation.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"14 1","pages":"233-239"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73540828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An approach towards automatic workflow composition through information retrieval","authors":"David Chiu, T. Hall, Farhana Kabir, G. Agrawal","doi":"10.1145/2076623.2076644","DOIUrl":"https://doi.org/10.1145/2076623.2076644","url":null,"abstract":"Understanding how to design, manage, and execute scientific workflows has become increasingly esoteric. Yet, despite the development of scientific workflow management systems, which have simplified workflow planning to some extent, a means to reduce the complexity of user interaction without forfeiting some robustness has been elusive. We believe that a keyword interface may be highly beneficial to common users in need of information which requires workflow planning and execution. In this paper, we describe a system that can automatically compose a set of relevant workflows, which may or may not have been previously defined by other users, given only a keyword query. We present a way to index data sets and Web services (utilized to compose workflows in our system) on their ontological attributes. This ontology allows us to facilitate an IR-based workflow retrieval model. We conducted a case study in geoinformatics with a set of real geospatial Web services, data, and their metadata annotations. Our system was capable of answering six keyword queries with fast search times (2.16ms on average) and relatively high Top-N precision values: 78%, 77.3%, and 76.2% for the Top 3, 5, and 10 retrieved workflows respectively.","PeriodicalId":93615,"journal":{"name":"Proceedings. International Database Engineering and Applications Symposium","volume":"4 1","pages":"170-178"},"PeriodicalIF":0.0,"publicationDate":"2011-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79709991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}