19th International Conference on Scientific and Statistical Database Management (SSDBM 2007): Latest Papers

Processing Spatial-Keyword (SK) Queries in Geographic Information Retrieval (GIR) Systems
Ramaswamy Hariharan, B. Hore, Chen Li, S. Mehrotra
{"title":"Processing Spatial-Keyword (SK) Queries in Geographic Information Retrieval (GIR) Systems","authors":"Ramaswamy Hariharan, B. Hore, Chen Li, S. Mehrotra","doi":"10.1109/SSDBM.2007.22","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.22","url":null,"abstract":"Location-based information contained in publicly available GIS databases is invaluable for many applications such as disaster response, national infrastructure protection, crime analysis, and numerous others. The information entities of such databases have both spatial and textual descriptions. Likewise, queries issued to the databases also contain spatial and textual components, for example, \"Find shelters with emergency medical facilities in Orange County,\" or \"Find earthquake-prone zones in Southern California.\" We refer to such queries as spatial-keyword queries or SK queries for short. In recent times, a lot of interest has been generated in efficient processing of SK queries for a variety of applications from Web-search to GIS decision support systems. We refer to systems built for enabling such applications as Geographic Information Retrieval (GIR) Systems. An example GIR system that we address in this paper is a search engine built on top of hundreds of thousands of publicly available GIS databases. Building a search engine over such large repositories is a challenge. One of the key aspects of such a search engine is the performance. In this paper, we propose a framework for GIR systems and focus on indexing strategies that can process SK queries efficiently. 
We show through experiments that our indexing strategies lead to significant improvement in efficiency of answering SK queries over existing techniques.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127628594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 286
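An SK query such as the shelter example in the abstract above combines a spatial predicate with a keyword predicate. The following is a minimal sketch of the unindexed linear-scan baseline that an indexing strategy would aim to beat; it is not the paper's method. The `GeoObject` class, the `sk_query` function, the bounding-box representation of the region, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoObject:
    """Hypothetical GIS entity with a point location and a text description."""
    name: str
    x: float
    y: float
    text: str


def sk_query(objects, bbox, keywords):
    """Return objects inside bbox whose text contains every keyword.

    bbox is (min_x, min_y, max_x, max_y). This is a plain linear scan,
    i.e. the baseline with no spatial or text index at all.
    """
    min_x, min_y, max_x, max_y = bbox
    wanted = {k.lower() for k in keywords}
    hits = []
    for obj in objects:
        inside = min_x <= obj.x <= max_x and min_y <= obj.y <= max_y
        words = set(obj.text.lower().split())
        if inside and wanted <= words:
            hits.append(obj)
    return hits


# Made-up sample data; coordinates are arbitrary units.
objects = [
    GeoObject("A", 1.0, 1.0, "shelter with emergency medical facilities"),
    GeoObject("B", 9.0, 9.0, "shelter with emergency medical facilities"),
    GeoObject("C", 2.0, 2.0, "public library"),
]
result = sk_query(objects, (0.0, 0.0, 5.0, 5.0), ["shelter", "medical"])
```

Every query here touches every object, which is why a combined spatial and textual index matters for large repositories.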
Database Support for Weighted Match Joins
A. Kini, J. Naughton
{"title":"Database Support for Weighted Match Joins","authors":"A. Kini, J. Naughton","doi":"10.1109/SSDBM.2007.31","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.31","url":null,"abstract":"As relational database management systems are applied to non-traditional domains such as scientific data management, there is an increasing need to support queries with semantics that differ from those appropriate for traditional RDBMS applications. Two interesting ideas currently being explored in the DBMS community are ranking query results (e.g., top-k computations) and, more recently, \"match joins.\" In this paper we combine these two ideas and study weighted match joins, in which (a) like match joins, each tuple joins with at most one matching tuple, and (b) like top-k joins, the system attempts to provide a set of answer tuples that maximizes a weight function. We explore exact and approximate strategies for evaluating weighted match joins. Using a prototype implementation in PostgreSQL, we explore the performance characteristics of these strategies. Our results suggest that the DBMS optimization-based approach of providing several implementations of an operator and then choosing an appropriate one at run time can be useful in computing weighted match joins.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124676572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
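The abstract above mentions approximate strategies for weighted match joins, where each tuple joins with at most one partner and the total weight should be as large as possible. One well-known approximate approach to this kind of problem is a greedy matching; the sketch below illustrates it and is not claimed to be the strategy the authors implemented. The function name `greedy_weighted_match_join` and the `(r, s, weight)` triple format are assumptions.

```python
def greedy_weighted_match_join(candidates):
    """Greedy approximation of a weighted match join.

    candidates is an iterable of (r, s, weight) triples, one per pair of
    tuples that satisfies the join predicate. Pairs are taken in order of
    decreasing weight, and a pair is kept only if neither tuple has been
    matched already, so each tuple joins with at most one partner.
    """
    used_r, used_s = set(), set()
    result = []
    for r, s, w in sorted(candidates, key=lambda t: -t[2]):
        if r not in used_r and s not in used_s:
            used_r.add(r)
            used_s.add(s)
            result.append((r, s, w))
    return result


# Hypothetical candidate pairs: greedy keeps the heaviest pair first.
pairs = [("r1", "s1", 5), ("r1", "s2", 4), ("r2", "s1", 4)]
matched = greedy_weighted_match_join(pairs)
total = sum(w for _, _, w in matched)
```

On this input the greedy result has total weight 5, while the exact optimum (`r1` with `s2`, `r2` with `s1`) has weight 8, which shows the trade-off between exact and approximate evaluation.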
Reservoir Sampling over Memory-Limited Stream Joins
Mohammed Al-Kateb, B. Lee, X. Wang
{"title":"Reservoir Sampling over Memory-Limited Stream Joins","authors":"Mohammed Al-Kateb, B. Lee, X. Wang","doi":"10.1109/SSDBM.2007.40","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.40","url":null,"abstract":"In stream join processing with limited memory, uniform random sampling is useful for approximate query evaluation. In this paper, we address the problem of reservoir sampling over memory-limited stream joins. We present two sampling algorithms, reservoir join-sampling (RJS) and progressive reservoir join-sampling (PRJS). RJS is designed straightforwardly by using a fixed-size reservoir sampling on a join-sample (i.e., random sample of a join output stream). Anytime the sample in the reservoir is used, RJS always gives a uniform random sample of the original join output stream. With limited memory, however, the available memory may not be large enough even for the join buffer, thereby severely limiting the reservoir size. PRJS alleviates this problem by increasing the reservoir size during the join-sampling. This increasing is possible since the memory requirement by the join-sampling algorithm decreases over time. A larger reservoir provides a closer representation of the original join output stream. However, it comes with a negative impact on the probability of the sample being uniform. 
Through experiments we examine the tradeoffs and compare the two algorithms in terms of the aggregation error on the reservoir sample.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134116795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
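The abstract above builds on fixed-size reservoir sampling. As a minimal sketch, the classic single-pass reservoir algorithm (often called Algorithm R) looks as follows. This is only the building block, not RJS or PRJS, and the function name `reservoir_sample` and its parameters are assumptions.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from a stream.

    The stream is read once and its length need not be known in advance.
    After i + 1 items have been seen, each of them is in the reservoir
    with probability k / (i + 1).
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill phase: the first k items always enter the reservoir.
            reservoir.append(item)
        else:
            # Replace phase: pick a slot in [0, i]; replace only if it is < k.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), 10, random.Random(0))
```

In a join setting the `stream` argument would be the join output, or a sample of it, rather than a base stream.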
Efficient Evaluation of Inbreeding Queries on Pedigree Data
Brendan Elliott, Suleyman Fatih Akgul, Stephen Mayes, Z. M. Özsoyoglu
{"title":"Efficient Evaluation of Inbreeding Queries on Pedigree Data","authors":"Brendan Elliott, Suleyman Fatih Akgul, Stephen Mayes, Z. M. Özsoyoglu","doi":"10.1109/SSDBM.2007.12","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.12","url":null,"abstract":"We consider pedigree data structured in the form of a directed acyclic graph, and use an encoding scheme, called NodeCodes, for expediting the evaluation of queries on pedigree graph structures. Inbreeding is the quantitative measure of the genetic relationship between two individuals. The inbreeding coefficient is related to the probability that both copies of any given gene are received from the same ancestor. In this paper we discuss the evaluation of the inbreeding coefficient of a given individual using NodeCodes. We implemented and tested our approach with both synthetic and real pedigree data. Experimental results show that the use of NodeCodes provides a good alternative for queries involving the inbreeding coefficient, with significant improvements over the traditional iterative evaluation methods.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127776575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
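For context on the quantity discussed in the abstract above, the inbreeding coefficient of an individual equals the kinship coefficient of its parents, which can be computed recursively over the pedigree graph. The sketch below uses that textbook recursion; it does not use NodeCodes and is not the authors' algorithm. The `make_inbreeding` helper and the dictionary format mapping each individual to a `(sire, dam)` pair are assumptions, and unknown parents are treated as unrelated founders.

```python
from functools import lru_cache


def make_inbreeding(pedigree):
    """Build an inbreeding-coefficient function for a pedigree DAG.

    pedigree maps an individual to (sire, dam); founders are absent from
    the mapping or map to (None, None).
    """

    def parents(x):
        return pedigree.get(x, (None, None))

    @lru_cache(maxsize=None)
    def depth(x):
        # Longest path to a founder; an ancestor always has smaller depth.
        known = [p for p in parents(x) if p is not None]
        return 0 if not known else 1 + max(depth(p) for p in known)

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            return 0.5 * (1.0 + inbreeding(a))
        # Expand the deeper individual, which cannot be an ancestor of the other.
        if depth(a) < depth(b):
            a, b = b, a
        sire, dam = parents(a)
        return 0.5 * (kinship(sire, b) + kinship(dam, b))

    def inbreeding(x):
        sire, dam = parents(x)
        return kinship(sire, dam)

    return inbreeding


# Hypothetical pedigree: E is the offspring of full siblings C and D.
full_sib = {"C": ("A", "B"), "D": ("A", "B"), "E": ("C", "D")}
f_full = make_inbreeding(full_sib)("E")

# Hypothetical pedigree: G is the offspring of half siblings D and E.
half_sib = {"D": ("A", "B"), "E": ("A", "C"), "G": ("D", "E")}
f_half = make_inbreeding(half_sib)("G")
```

The two examples reproduce the standard values of 0.25 for offspring of full siblings and 0.125 for offspring of half siblings.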
A Fast Algorithm for Approximate Quantiles in High Speed Data Streams
Qi Zhang, Wei Wang
{"title":"A Fast Algorithm for Approximate Quantiles in High Speed Data Streams","authors":"Qi Zhang, Wei Wang","doi":"10.1109/SSDBM.2007.27","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.27","url":null,"abstract":"We present a fast algorithm for computing approximate quantiles in high speed data streams with deterministic error bounds. For data streams of size N where N is unknown in advance, our algorithm partitions the stream into sub-streams of exponentially increasing size as they arrive. For each sub-stream which has a fixed size, we compute and maintain a multi-level summary structure using a novel algorithm. In order to achieve high speed performance, the algorithm uses simple block-wise merge and sample operations. Overall, our algorithms for fixed-size streams and arbitrary-size streams have a computational cost of O(N log(1/ε log εN)) and an average per-element update cost of O(log log N) if ε is fixed.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134332011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 40
Enabling Real-Time Querying of Live and Historical Stream Data
Frederick Reiss, Kurt Stockinger, Kesheng Wu, A. Shoshani, J. Hellerstein
{"title":"Enabling Real-Time Querying of Live and Historical Stream Data","authors":"Frederick Reiss, Kurt Stockinger, Kesheng Wu, A. Shoshani, J. Hellerstein","doi":"10.1109/SSDBM.2007.34","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.34","url":null,"abstract":"Applications that query data streams in order to identify trends, patterns, or anomalies can often benefit from comparing the live stream data with archived historical stream data. However, searching this historical data in real time has been considered so far to be prohibitively expensive. One of the main bottlenecks is the update costs of the indices over the archived data. In this paper, we address this problem by using our highly-efficient bitmap indexing technology (called FastBit) and demonstrate that the index update operations are sufficiently efficient for this bottleneck to be removed. We describe our prototype system based on the TelegraphCQ streaming query processor and the FastBit bitmap index. We present a detailed performance evaluation of our system using a complex query workload for analyzing real network traffic data. The combined system uses TelegraphCQ to analyze streams of traffic information and FastBit to correlate current behaviors with historical trends. 
We demonstrate that our system can simultaneously analyze (1) live streams with high data rates and (2) a large repository of historical stream data.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121340763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 76
CSR+-tree: Cache-conscious Indexing for High-dimensional Similarity Search
Junfeng Dong, Xiaohui Yu
{"title":"CSR+-tree: Cache-conscious Indexing for High-dimensional Similarity Search","authors":"Junfeng Dong, Xiaohui Yu","doi":"10.1109/SSDBM.2007.9","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.9","url":null,"abstract":"In this paper, we propose a novel index structure, the CSR+-tree, to support efficient high-dimensional similarity search in main memory. We introduce quantized bounding spheres (QBSs) that approximate bounding spheres (BSs) or data points. We analyze the respective pros and cons of both QBSs and the previously proposed quantized bounding rectangles (QBRs), and take the best of both worlds by carefully incorporating both of them into the CSR+-tree. We further propose a novel distance computation scheme that eliminates the need for decompressing QBSs or QBRs, which results in significant cost savings. We present an extensive experimental evaluation and analysis of the CSR+-tree, and compare its performance against that of other representative indexes in the literature. Our results show that the CSR+-tree consistently outperforms other index structures.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130310068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
iSEE: Efficient Continuous K-Nearest-Neighbor Monitoring over Moving Objects
Wei Wu, K. Tan
{"title":"iSEE: Efficient Continuous K-Nearest-Neighbor Monitoring over Moving Objects","authors":"Wei Wu, K. Tan","doi":"10.1109/SSDBM.2007.37","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.37","url":null,"abstract":"In this paper, we propose iSEE, a set of algorithms for efficient processing of continuous k-nearest-neighbor (CKNN) queries over moving objects. iSEE utilizes a grid index and incrementally updates the queries' results based on moving objects' explicit location update messages. We have three innovations in iSEE: a Visit Order Builder (VOB) method that dynamically constructs a query's optimal visit order to the cells in the grid index with low cost, an Efficient Expand (EFEX) algorithm which avoids unnecessary and redundant searching when updating a query's result, and an efficient algorithm that quickly identifies the cells that should be updated after a query's result is changed. Experimental results show that iSEE achieves a 2X speedup, when compared with the state-of-the-art CPM scheme.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115066719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
What Constitutes a Scientific Database?
J. Pfaltz
{"title":"What Constitutes a Scientific Database?","authors":"J. Pfaltz","doi":"10.1109/SSDBM.2007.25","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.25","url":null,"abstract":"We propose that a scientific database should be inherently different from, say a business database. The difference is based on the nature of science itself, in which hypotheses, or logical implications, form an essential part of the discipline. Empirical observations give rise to tentative hypotheses. Individual hypotheses are then tested, refuted or refined, by further empirical observation. In the paper, we propose representing the observational data of science in a lattice format that also conveys all the logical implications that can be supported by those observations. We claim that such a structure can be incrementally created and that the hypotheses formed will adapt to new data. We demonstrate its practicality by presenting two real situations in which it has been used. Finally, we look at the rather considerable storage costs associated with this approach and discuss other limitations that are still unresolved in this new approach to the representation of scientific data.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"408 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126978930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Sensor Scheduling for Aggregate Monitoring in Wireless Sensor Networks
Xingbo Yu, S. Mehrotra, N. Venkatasubramanian
{"title":"Sensor Scheduling for Aggregate Monitoring in Wireless Sensor Networks","authors":"Xingbo Yu, S. Mehrotra, N. Venkatasubramanian","doi":"10.1109/SSDBM.2007.42","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.42","url":null,"abstract":"Most of the applications of wireless sensor networks involve primarily data collection with in-network processing in which continuous aggregate queries are posed and processed. There are two principal concerns with this type of application. First, due to the use of batteries, limited power resource has been identified as a major challenge in deploying wireless sensor networks. Second, data is usually expected to be gathered as soon as possible to facilitate the monitoring of and the response to the physical phenomena. In this paper, we tackle these challenges through sensor state scheduling. The proposed technique is based on the observation that there are two types of traffic in sensor networks designed for data aggregation, bottom-up and top-down within an abstract tree structure. We show that it is possible to achieve deterministic schedules for data aggregation with very good performance. Specifically, we develop greedy algorithms to schedule transmission and listening operations for each sensor node to achieve collision-free communication. 
We show that the schedules can maximize the time sensor nodes spend in low-power states, which helps achieve great energy efficiency, as well as allow fast data aggregation.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125949664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20