Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004. Latest articles

Self-deadlocks in disparate scientific data management systems
F. Pentaris, Y. Ioannidis
Abstract: In large statistical and scientific data management environments, where mediation architectures are used to integrate disparate and autonomous systems, a new problem - self-deadlock - may cause global transaction failures. In this short paper we briefly examine the reasons causing this problem and identify some algorithms for resolving it.
DOI: 10.1109/SSDBM.2004.63 (https://doi.org/10.1109/SSDBM.2004.63)
Published: 2004-06-21
Citations: 0
Efficient query processing on relational data-partitioning index structures
H. Kriegel, Peter Kunath, M. Pfeifle, M. Renz
Abstract: In contrast to space-partitioning index structures, data-partitioning index structures naturally adapt to the actual data distribution, which results in very good query response behavior. Besides efficient query processing, modern database applications including computer-aided design, medical imaging, or molecular biology require fully-fledged database management systems in order to guarantee industrial strength. In this paper, we show how we can achieve efficient query processing on data-partitioning index structures within general purpose database systems. We reduce the navigational index traversal cost by using "extended index range scans". If a directory node is "largely" covered by the actual query, the recursive tree traversal for this node can beneficially be replaced by a scan on the leaf level of the index instead of navigating through the directory any longer. On the other hand, for highly selective queries, the index is used as usual. In this paper, we demonstrate the benefits of this idea for spatial collision queries on the relational R-tree. Our experiments with an Oracle9i database system show that our new approach outperforms common index structures and the sequential scan considerably.
DOI: 10.1109/SSDBM.2004.32 (https://doi.org/10.1109/SSDBM.2004.32)
Published: 2004-06-21
Citations: 2
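The choice described in this abstract, between navigating the directory and scanning the leaf level, can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the `Rect` type, the `coverage` measure, and the 0.8 threshold are hypothetical stand-ins for the abstract's "largely covered" criterion, which the abstract does not quantify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Empty or inverted rectangles have zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    def intersection(self, other: "Rect") -> "Rect":
        # May return an inverted rectangle when the inputs are disjoint;
        # area() then reports 0.0.
        return Rect(max(self.x1, other.x1), max(self.y1, other.y1),
                    min(self.x2, other.x2), min(self.y2, other.y2))


def coverage(node_mbr: Rect, query: Rect) -> float:
    """Fraction of the node's bounding rectangle covered by the query."""
    node_area = node_mbr.area()
    if node_area == 0.0:
        return 0.0
    return node_mbr.intersection(query).area() / node_area


def choose_access_path(node_mbr: Rect, query: Rect, threshold: float = 0.8) -> str:
    """Pick a leaf-level scan for largely covered nodes, else keep navigating.

    The threshold value is an assumption made for this sketch.
    """
    if coverage(node_mbr, query) >= threshold:
        return "leaf_scan"
    return "navigate"
```

With a node rectangle of `Rect(0, 0, 10, 10)`, a query covering 90% of it selects the leaf scan, while a query touching only one corner keeps the usual directory navigation.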
A weight-based map matching method in moving objects databases
Huabei Yin, O. Wolfson
Abstract: In location management, the trajectory represents the motion of a moving object in 3D space-time, i.e., a sequence (x, y, t). Unfortunately, location technologies cannot guarantee error-freedom. Thus, map matching (a.k.a. snapping), matching a trajectory to the roads on the map, is necessary. We introduce a weight-based map matching method, and experimentally show that, for the offline situation, on average, our algorithm can get up to 94% correctness depending on the GPS sampling interval.
DOI: 10.1109/SSDBM.2004.10 (https://doi.org/10.1109/SSDBM.2004.10)
Published: 2004-06-21
Citations: 143
MDDQL-Stat: data querying and analysis through integration of intentional and extensional semantics
E. Kapetanios, David Baer, Björn Glaus, Paul Groenewoud
Abstract: We would like to present a prototype system enabling a rather empirical than a formal approach to the problem of posing queries to a semantically rich (quality aspects, semantic distance, etc.) data integration system {G,S,M} (Global schema, Sources, Mediation) through integration not only of intensional but also of extensional semantics. While the first is provided by an alphabet A as given by an ontology based global schema C, and a high level query language (conjunction/disjunction + inequalities + statistical operations), the latter enables synthesizing of data source specific and previously transformed query results according to well-defined set operations for heterogeneous, distributed data sources. Our approach contrasts with other GAV (Global-As-View) related architectures for mediation of integrated read-only views, in that it simplifies query processing while preserving flexibility when adding new data sources, despite the inherited complexity of mappings due to enhanced semantic description of data (semantic distance, quality parameters, etc.) such that statistical results and comparisons become more meaningful.
DOI: 10.1109/SSDBM.2004.49 (https://doi.org/10.1109/SSDBM.2004.49)
Published: 2004-06-21
Citations: 9
AutoPart: automating schema design for large scientific databases using data partitioning
Stratos Papadomanolakis, A. Ailamaki
Abstract: Database applications that use multi-terabyte datasets are becoming increasingly important for scientific fields such as astronomy and biology. Scientific databases are particularly suited for the application of automated physical design techniques, because of their data volume and the complexity of the scientific workloads. Current automated physical design tools focus on the selection of indexes and materialized views. In large-scale scientific databases, however, the data volume and the continuous insertion of new data allow for only limited indexes and materialized views. By contrast, data partitioning does not replicate data, thereby reducing space requirements and minimizing update overhead. In this paper we present AutoPart, an algorithm that automatically partitions database tables to optimize sequential access assuming prior knowledge of a representative workload. The resulting schema is indexed using a fraction of the space required for indexing the original schema. To evaluate AutoPart we built an automated schema design tool that interfaces to commercial database systems. We experiment with AutoPart in the context of the Sloan Digital Sky Survey database, a real-world astronomical database, running on SQL Server 2000. Our experiments demonstrate the benefits of partitioning for large-scale systems: partitioning alone improves query execution performance by a factor of two on average. Combined with indexes, the new schema also outperforms the indexed original schema by 20% (for queries) and a factor of five (for updates), while using only half the original index space.
DOI: 10.1109/SSDBM.2004.19 (https://doi.org/10.1109/SSDBM.2004.19)
Published: 2004-06-21
Citations: 154
A scalable approach to approximating aggregate queries over intermittent streams
Shanzhong Zhu, C. Ravishankar
Abstract: We present a novel approach to approximate evaluation of standing aggregate queries over streaming data, subject to user-specified error bounds. Our method models the behavior of aggregates as Brownian motions, and adaptively updates the model according to stream characteristics. This approach has two advantages. First, it greatly improves system scalability since we can defer query evaluation as long as the difference between the returned and true aggregate values remains within user-specified bounds. Second, we are able to provide approximate answers during stream interruptions by estimating the rate at which the streams and the aggregate drift during the blackout periods. We also study processor allocation issues in such approximate aggregate evaluation systems. Our experiments show that our model captures the behavior of real-world streams such as sensor data and stock traces with excellent fidelity, and scales very well for large numbers of standing queries.
DOI: 10.1109/SSDBM.2004.6 (https://doi.org/10.1109/SSDBM.2004.6)
Published: 2004-06-21
Citations: 5
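A rough sketch can show why a Brownian-motion model allows evaluation to be deferred. If the gap between the returned and the true aggregate is treated as a driftless Brownian motion with volatility sigma, its value after time t is normally distributed with standard deviation sigma * sqrt(t). The code below relies only on that textbook property. The function names, the parameters, and the simplification of checking the bound at the refresh instant (rather than over the whole interval) are assumptions made for illustration, not the paper's algorithm.

```python
import math
from statistics import NormalDist


def violation_probability(sigma: float, epsilon: float, t: float) -> float:
    """P(|B_t| > epsilon) for B_t ~ N(0, sigma^2 * t)."""
    std = sigma * math.sqrt(t)
    return 2.0 * (1.0 - NormalDist().cdf(epsilon / std))


def max_defer_time(sigma: float, epsilon: float, p: float) -> float:
    """Longest delay t with P(|B_t| > epsilon) <= p.

    sigma   : assumed volatility of the aggregate per unit time
    epsilon : user-specified error bound
    p       : tolerated probability of exceeding the bound at time t
    """
    if sigma <= 0 or epsilon <= 0 or not 0.0 < p < 1.0:
        raise ValueError("sigma and epsilon must be positive, p in (0, 1)")
    z = NormalDist().inv_cdf(1.0 - p / 2.0)  # two-sided normal quantile
    return (epsilon / (sigma * z)) ** 2
```

Under this simplified model the permissible delay grows with the square of the error bound, so doubling the tolerated error quadruples the time a query can stay unevaluated.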
A fast algorithm for subspace clustering by pattern similarity
Haixun Wang, F. Chu, W. Fan, Philip S. Yu, J. Pei
Abstract: Unlike traditional clustering methods that focus on grouping objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rise and fall in subspaces. Pattern-based clustering extends the concept of traditional clustering and benefits a wide range of applications, including large scale scientific data analysis, target marketing, Web usage analysis, etc. However, state-of-the-art pattern-based clustering methods (e.g., the pCluster algorithm) can only handle data sets of thousands of records, which makes them inappropriate for many real-life applications. Furthermore, besides the huge data volume, many data sets are also characterized by their sequentiality, for instance, customer purchase records and network event logs are usually modeled as data sequences. Hence, it becomes important to enable pattern-based clustering methods i) to handle large datasets, and ii) to discover pattern similarity embedded in data sequences. In this paper, we present a novel algorithm that offers this capability. Experimental results from both real life and synthetic datasets prove its effectiveness and efficiency.
DOI: 10.1109/SSDBM.2004.3 (https://doi.org/10.1109/SSDBM.2004.3)
Published: 2004-06-21
Citations: 43
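The idea of a "coherent pattern of rise and fall" can be made concrete with the pScore test from the earlier pCluster model that this abstract mentions. For two objects x, y and two columns a, b, the pScore is |(x[a] - x[b]) - (y[a] - y[b])|, and a set of objects forms a delta-pCluster on a set of columns when every such 2x2 submatrix has a pScore of at most delta. The sketch below is a brute-force check of that earlier criterion, not the faster algorithm this paper presents. The function names and sample data are illustrative.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by objects x, y and columns a, b."""
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(objects, columns, delta):
    """True if every pair of objects and every pair of columns has pScore <= delta.

    Brute force over all 2x2 submatrices; suitable only for tiny inputs.
    """
    return all(
        p_score(x, y, a, b) <= delta
        for x, y in combinations(objects, 2)
        for a, b in combinations(columns, 2)
    )


# Rows 0 and 1 differ by a constant shift of 10, so they rise and fall together
# even though their values are far apart. Row 2 breaks the pattern on column 2.
rows = [
    [1, 5, 3],
    [11, 15, 13],
    [4, 8, 9],
]
shifted_pair_matches = is_delta_pcluster(rows[:2], [0, 1, 2], delta=0)
all_rows_match = is_delta_pcluster(rows, [0, 1, 2], delta=0)
all_rows_match_in_subspace = is_delta_pcluster(rows, [0, 1], delta=0)
```

Restricting the column set to `[0, 1]` makes all three rows coherent again, which is the sense in which the pattern lives in a subspace rather than in the full set of dimensions.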
SaIL: a library for efficient application integration of spatial indices
Marios Hadjieleftheriou, E. Hoel, V. Tsotras
Abstract: Many scientific applications deal with spatial, spatiotemporal and other multidimensional indexing structures, typically managing millions of objects with arbitrary and complex features. Choosing the appropriate method to index such data becomes rather difficult. Having an index library that can combine different indices under the same programming interface is thus very valuable. In this paper we present SaIL (SpAtial Index Library), a robust and extensible library that enables simple integration of spatial index structures in existing applications. We mainly focus on design issues and elaborate on techniques for making the framework generic enough, so that it can support user defined data types, customizable spatial queries, and a broad range of spatial (and spatiotemporal) index structures. The library is publicly available and has already been successfully utilized for research and commercial applications.
DOI: 10.1109/SSDBM.2004.60 (https://doi.org/10.1109/SSDBM.2004.60)
Published: 2004-06-21
Citations: 3
HybridTreeMiner: an efficient algorithm for mining frequent rooted trees and free trees using canonical forms
Yun Chi, Yirong Yang, R. Muntz
Abstract: Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we present HybridTreeMiner, a computationally efficient algorithm that discovers all frequently occurring subtrees in a database of rooted unordered trees. The algorithm mines frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees. The enumeration tree is defined based on a novel canonical form for rooted unordered trees - the breadth-first canonical form (BFCF). By extending the definitions of our canonical form and enumeration tree to free trees, our algorithm can efficiently handle databases of free trees as well. We study the performance of our algorithms through extensive experiments based on both synthetic data and datasets from real applications. The experiments show that our algorithm is competitive in comparison to known rooted tree mining algorithms and is faster by one to two orders of magnitude compared to a known algorithm for mining frequent free trees.
DOI: 10.1109/SSDBM.2004.41 (https://doi.org/10.1109/SSDBM.2004.41)
Published: 2004-06-21
Citations: 146
A shrinking-based dimension reduction approach for multi-dimensional analysis
Yong Shi, A. Zhang
Abstract: In this paper, we present continuous research on data analysis based on our previous work on the shrinking approach. Shrinking is a novel data preprocessing technique which optimizes the inner structure of data, inspired by Newton's Universal Law of Gravitation in the real world. It can be applied in many data mining fields. Following our previous work on the shrinking method for multidimensional data analysis in full data space, we propose a shrinking-based dimension reduction approach which tends to solve the dimension reduction problem from a new perspective. In this approach data are moved along the direction of the density gradient, thus making the inner structure of data more prominent. It is conducted on a sequence of grids with different cell sizes. The dimension reduction process is performed based on the difference of the data distribution projected on each dimension before and after the data-shrinking process. Those dimensions with dramatic variation of data distribution through the data-shrinking process are selected as good dimension candidates for further data analysis. This approach can help improve the performance of existing data analysis approaches. We demonstrate how this shrinking-based dimension reduction approach affects the clustering results of well known algorithms.
DOI: 10.1109/SSDBM.2004.8 (https://doi.org/10.1109/SSDBM.2004.8)
Published: 2004-06-21
Citations: 10