19th International Conference on Scientific and Statistical Database Management (SSDBM 2007): Latest Papers

Efficient Approximation of Spatial Network Queries using the M-Tree with Road Network Embedding
K. Shaw, Elias Ioup, J. Sample, M. Abdelguerfi, Olivier Tabone
{"title":"Efficient Approximation of Spatial Network Queries using the M-Tree with Road Network Embedding","authors":"K. Shaw, Elias Ioup, J. Sample, M. Abdelguerfi, Olivier Tabone","doi":"10.1109/SSDBM.2007.11","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.11","url":null,"abstract":"Spatial networks, such as road systems, operate differently from normal geospatial systems because objects are constrained to locations on the network. Performing queries on spatial networks demands entirely different solutions. Most spatial queries make use of an R-Tree to process them efficiently. The M-Tree is a data tree index which is capable of indexing data in any metric space. The M-Tree index can replace the R-Tree index for spatial network queries, such as range and KNN queries. The difficulty is that the M-Tree is only as efficient as the distance algorithm used on the underlying objects. Most network distance algorithms, such as A*, are too slow to allow the M-Tree to operate efficiently on spatial networks. The truncated road network embedding (tRNE) maps the network into a higher dimensional space where any LP metric can be used to efficiently compute an accurate approximation of network distance. The M-Tree combined with tRNE creates an efficient index structure for computing spatial network queries. The M-Tree substantially outperforms network expansion, the most popular method of computing spatial network queries, when performing spatial network KNN and range queries.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125414061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
On Exploring Complex Relationships of Correlation Clusters
Elke Achtert, C. Böhm, H. Kriegel, Peer Kröger, A. Zimek
{"title":"On Exploring Complex Relationships of Correlation Clusters","authors":"Elke Achtert, C. Böhm, H. Kriegel, Peer Kröger, A. Zimek","doi":"10.1109/SSDBM.2007.21","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.21","url":null,"abstract":"In high dimensional data, clusters often only exist in arbitrarily oriented subspaces of the feature space. In addition, these so-called correlation clusters may have complex relationships between each other. For example, a correlation cluster in a 1-D subspace (forming a line) may be enclosed within one or even several correlation clusters in 2-D superspaces (forming planes). In general, such relationships can be seen as a complex hierarchy that allows multiple inclusions, i.e. clusters may be embedded in several super-clusters rather than only in one. Obviously, uncovering the hierarchical relationships between the detected correlation clusters is an important information gain. Since existing approaches cannot detect such complex hierarchical relationships among correlation clusters, we propose the algorithm ERiC to tackle this problem and to visualize the result by means of a graph-based representation. In our experimental evaluation, we show that ERiC finds more information than state-of-the-art correlation clustering methods and outperforms existing competitors in terms of efficiency.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128050558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 62
Managing Scientific Data: New Challenges for Database Research
M. Winslett
{"title":"Managing Scientific Data: New Challenges for Database Research","authors":"M. Winslett","doi":"10.1109/SSDBM.2007.18","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.18","url":null,"abstract":"The database research community's appetite for new applications has led to increased interest in the data management needs of scientists. This area encompasses a huge range of applications, extending from public repositories of observational data such as the popular Sloan Digital Sky Survey to one-of-a-kind runs of simulation codes crafted by individual scientists. In this talk, we will survey the most common data management needs found in the hard sciences, describe the new database research challenges that arise from these needs, and outline ways to address some of these challenges.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116466770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database
M. Ivanova, N. Nes, R. Goncalves, M. Kersten
{"title":"MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database","authors":"M. Ivanova, N. Nes, R. Goncalves, M. Kersten","doi":"10.1109/SSDBM.2007.19","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.19","url":null,"abstract":"This paper presents our experiences in porting the Sloan Digital Sky Survey (SDSS)/SkyServer to the state-of-the-art open source database system MonetDB/SQL. SDSS acts as a well-documented benchmark for scientific database management. We have achieved a fully functional prototype for the personal SkyServer, to be downloaded from our site. The lessons learned are 1) the column store approach of MonetDB demonstrates a great potential in the world of scientific databases. However, the application also challenged the functionality of our implementation and revealed that a fully operational SQL environment is needed, e.g. including persistent stored modules; 2) the initial performance is competitive with the reference platform, MS SQL Server 2005, and 3) the analysis of SDSS query traces hints at several techniques to boost performance by utilizing repetitive behavior and zoom-in/zoom-out access patterns that are currently not captured by the system.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128815029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 60
Effective Summarization of Multi-Dimensional Data Streams for Historical Stream Mining
Samer Nassar, J. Sander
{"title":"Effective Summarization of Multi-Dimensional Data Streams for Historical Stream Mining","authors":"Samer Nassar, J. Sander","doi":"10.1109/SSDBM.2007.32","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.32","url":null,"abstract":"We consider the following problem: given a very large data stream, a limited space to encode the stream, and a compression technique to compress the stream, retain the most important information from the distant past of the stream while at the same time retain high quality of the compressed information that is in the recent part of the stream to perform temporal analysis of the summarized information. Simple schemes for accumulating micro-clustering summaries of stream windows that have been previously proposed are very ineffective for solving this challenging task. We overcome the limitations of these schemes by first identifying spatial summaries that compress \"similar\" regions in the data space, and reduce their space consumption using novel approximate spatio-temporal summaries. Second, we present policies for effectively utilizing the space budget and managing these novel approximate spatio-temporal summaries.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131866442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Efficient Indexing of Heterogeneous Data Streams with Automatic Performance Configurations
K. Pu, Ying Zhu
{"title":"Efficient Indexing of Heterogeneous Data Streams with Automatic Performance Configurations","authors":"K. Pu, Ying Zhu","doi":"10.1109/SSDBM.2007.33","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.33","url":null,"abstract":"We study the problem of indexing continuous data streams in which data are heterogeneous in structure. Such data streams arise naturally in many real-life scenarios such as sensor networks. Our index structure uses bitmap based techniques to efficiently sketch the structures to allow space-efficient lossless archiving of the data stream. It also allows very fast query processing on the archived data stream. Furthermore, our index structure adapts to structural evolutions of the stream to ensure good indexing and querying performance both in space and time. We developed a cost-based optimization framework so the indexing engine adjusts its configuration at run-time to adapt to changes in the data stream. By means of linear feedback controllers, structural clustering and steepest gradient ascent optimization, our indexing engine can achieve excellent performance without any human intervention.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122246789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
On Efficient Processing of Subspace Skyline Queries on High Dimensional Data
Wen Jin, A. Tung, M. Ester, Jiawei Han
{"title":"On Efficient Processing of Subspace Skyline Queries on High Dimensional Data","authors":"Wen Jin, A. Tung, M. Ester, Jiawei Han","doi":"10.1109/SSDBM.2007.20","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.20","url":null,"abstract":"Recent studies on efficiently answering subspace skyline queries can be separated into two approaches. The first focuses on pre-materializing a set of skyline points in various subspaces, while the second focuses on dynamically answering the queries by using a set of anchors to prune off skyline points through spatial reasoning. Despite efforts to compress the pre-materialized subspace skylines through removal of redundancy, the storage space for the first approach remains exponential in the number of dimensions. The query time for the second approach, on the other hand, also grows substantially for data with higher dimensionality, where the pruning power of anchors becomes much weaker. In this paper, we propose methods for answering subspace skyline queries on high dimensional data such that both prematerialization storage and query time can be moderated. We propose novel notions of maximal partial-dominating space, maximal partial-dominated space and the maximal equality space between pairs of skyline objects in the full space and use these concepts as the foundation for answering subspace skyline queries for high dimensional data. Query processing involves mostly simple pruning operations while skyline computation is done only on a small subset of candidate skyline points in the subspace. We also develop a random sampling method to compute the subspace skyline in an on-line fashion. Extensive experiments have been conducted and demonstrate the efficiency and effectiveness of our methods.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"49 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130293608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34
Incorporating Uncertainty Metrics into a General-Purpose Data Integration System
Brenton Louie, L. Detwiler, Nilesh N. Dalvi, Ron Shaker, P. Tarczy-Hornoch, Dan Suciu
{"title":"Incorporating Uncertainty Metrics into a General-Purpose Data Integration System","authors":"Brenton Louie, L. Detwiler, Nilesh N. Dalvi, Ron Shaker, P. Tarczy-Hornoch, Dan Suciu","doi":"10.1109/SSDBM.2007.36","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.36","url":null,"abstract":"There is a significant need for data integration capabilities in the scientific domain, which has manifested itself as products in the commercial world as well as academia. However, in our experiences in dealing with biological data it has become apparent to us that existing data integration products do not handle uncertainties in the data very well. This leads to systems that often produce an explosion of less relevant answers which subsequently leads to a loss of more relevant answers by overloading the user. How to incorporate functionality into data integration systems to properly handle uncertainties and make results more useful has become an important research question. In this paper we describe an enhanced general-purpose data integration system which incorporates uncertainty metrics within a formal probabilistic framework. Additionally, for evaluation purposes, we have implemented a use case scenario which utilizes biological data sources and performed a study which provides validation of system query results.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132128525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Window-Oblivious Join: A Data-Driven Memory Management Scheme for Stream Join
Ji Wu, K. Tan, Yongluan Zhou
{"title":"Window-Oblivious Join: A Data-Driven Memory Management Scheme for Stream Join","authors":"Ji Wu, K. Tan, Yongluan Zhou","doi":"10.1109/SSDBM.2007.43","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.43","url":null,"abstract":"Memory management is a critical issue in stream processing involving stateful operators such as join. Traditionally, the memory requirement for a stream join is query-driven: a query has to explicitly define a window for each (potentially unbounded) input. The window essentially bounds the size of the buffer allocated for that stream. However, outputs produced by such an approach may not be desirable (if the window size is not part of the intended query semantics) due to the volatile input characteristics. We discover that when streams are ordered or partially ordered, it is possible to use a data-driven memory management scheme for improved performance. In this work, we present a novel data-driven memory management scheme, called Window-Oblivious Join (WO-Join), which adaptively adjusts the state buffer size according to the input characteristics. Our performance study shows that, compared to traditional Window-Join (W-Join), WO-Join is more robust with respect to the dynamic inputs and therefore produces higher quality results with lower memory costs.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"180 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122592825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Adaptive Wavelet Density Estimators over Data Streams
C. Heinz, B. Seeger
{"title":"Adaptive Wavelet Density Estimators over Data Streams","authors":"C. Heinz, B. Seeger","doi":"10.1109/SSDBM.2007.28","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.28","url":null,"abstract":"A variety of scientific and commercial applications requires an immediate analysis of transient data streams. Many approaches for analyzing data share the property that an estimation of the underlying data distribution is used as a fundamental building block. To estimate the density of a continuous data distribution, wavelet density estimation, a technique from the area of nonparametric statistics, is very appealing as it is theoretically well-founded and practically approved. For that reason, its application to data streams is highly promising; it provides a convenient way to analyze the characteristics of a stream. However, the heavy computational cost of wavelet density estimators renders their direct application to the streaming scenario impossible. In this work, we tackle this problem and present a novel approach to adaptive wavelet density estimators over data streams. Not only do our estimators meet the rigid processing requirements for data streams, they also adapt to changing system resources in a well-defined manner. A thorough experimental evaluation demonstrates the efficacy of our wavelet density estimators and shows their superiority to competing kernel- and histogram-based estimators.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115215124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13