19th International Conference on Scientific and Statistical Database Management (SSDBM 2007): Latest Papers

Adaptive-Size Reservoir Sampling over Data Streams
Mohammed Al-Kateb, B. Lee, X. Wang
{"title":"Adaptive-Size Reservoir Sampling over Data Streams","authors":"Mohammed Al-Kateb, B. Lee, X. Wang","doi":"10.1109/SSDBM.2007.29","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.29","url":null,"abstract":"Reservoir sampling is a well-known technique for sequential random sampling over data streams. Conventional reservoir sampling assumes a fixed-size reservoir. There are situations, however, in which it is necessary and/or advantageous to adaptively adjust the size of a reservoir in the middle of sampling due to changes in data characteristics and/or application behavior. This paper studies adaptive size reservoir sampling over data streams considering two main factors: reservoir size and sample uniformity. First, the paper conducts a theoretical study on the effects of adjusting the size of a reservoir while sampling is in progress. The theoretical results show that such an adjustment may bring a negative impact on the probability of the sample being uniform (called uniformity confidence herein). Second, the paper presents a novel algorithm for maintaining the reservoir sample after the reservoir size is adjusted such that the resulting uniformity confidence exceeds a given threshold. Third, the paper extends the proposed algorithm to an adaptive multi-reservoir sampling algorithm for a practical application in which samples are collected from memory-limited wireless sensor networks using a mobile sink. 
Finally, the paper empirically examines the adaptivity of the multi-reservoir sampling algorithm with regard to reservoir size and sample uniformity using real sensor networks data sets.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116907025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
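For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of conventional fixed-size reservoir sampling (commonly known as Algorithm R). It is not the adaptive-size algorithm proposed in the paper; the function name `reservoir_sample` and its parameters are illustrative assumptions.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of at most k items from a stream.

    Conventional fixed-size reservoir sampling (Algorithm R). The name and
    signature are hypothetical; the paper's adaptive variant is not shown.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5, random.Random(0)))
```

The reservoir size `k` is fixed for the whole run here, which is exactly the assumption the paper relaxes.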
A Distributed Algorithm for Joins in Sensor Networks
Alexandru Coman, M. Nascimento
{"title":"A Distributed Algorithm for Joins in Sensor Networks","authors":"Alexandru Coman, M. Nascimento","doi":"10.1109/SSDBM.2007.26","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.26","url":null,"abstract":"Given their autonomy, flexibility and large range of functionality, wireless sensor networks can be used as an effective and discrete means for monitoring data in many domains. Typical sensor nodes are very constrained, in particular regarding their energy and memory resources. Thus, any query processing solution over these devices should consider their limitations. We investigate the problem of processing join queries within a sensor network. Due to the limited memory at nodes, joins are typically processed in a distributed manner over a set of nodes. Previous approaches have either assumed that the join processing nodes have sufficient memory to buffer the subset of the join relations assigned to them, or that the amount of available memory at nodes is known in advance. These assumptions are not realistic for most scenarios. In this context we propose and investigate DIJ, a distributed algorithm for join processing that considers the memory limitations at nodes and does not make a priori assumptions on the available memory at the processing nodes. 
At the same time, our algorithm still aims at minimizing the energy cost of query processing.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122570212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Reliable Hierarchical Data Storage in Sensor Networks
Song Lin, Benjamin Arai, D. Gunopulos
{"title":"Reliable Hierarchical Data Storage in Sensor Networks","authors":"Song Lin, Benjamin Arai, D. Gunopulos","doi":"10.1109/SSDBM.2007.39","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.39","url":null,"abstract":"The ability to provide reliable in-network storage while balancing the energy consumption of individual sensors is a primary concern when deploying a sensor network. The main concern with data-centric storage in sensor networks is the ability to provide reliable and load balanced storage. Energy and wireless range constraints make centralized approaches for storage impractical, and in-network data-centric solutions can be used to reduce the number of messages sent over the network. However, these solutions quickly become expensive when combined with fault- tolerance, load balancing and routing. In this paper, we present a novel data-centric storage and query routing mechanism for sensor networks. The routing mechanism is constructed upon the neighborhood information of individual sensors and is completely independent of geographical information. Our data resilient algorithm is capable of recovering from multiple simultaneous failures in the network while adaptively adjusting the load distribution of the newly generated sensor data. 
Comprehensive experiments on both real-world and synthetic data sets indicate that our approach is more effective and efficient than the previously proposed solutions.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121207853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Information-Aware 2^n-Tree for Efficient Out-of-Core Indexing of Very Large Multidimensional Volumetric Data
Jusub Kim, J. JáJá
{"title":"Information-Aware 2^n-Tree for Efficient Out-of-Core Indexing of Very Large Multidimensional Volumetric Data","authors":"Jusub Kim, J. JáJá","doi":"10.1109/SSDBM.2007.15","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.15","url":null,"abstract":"We discuss a new efficient out-of-core multidimensional indexing structure, information-aware 2^n-tree, for indexing very large multidimensional volumetric data. Building a series of (n-1)-Dimensional indexing structures on n-Dimensional data causes a scalability problem in the situation of continually growing resolution in every dimension. However, building a single n-Dimensional indexing structure can cause an indexing effectiveness problem compared to the former case. The information-aware 2^n-tree is an effort to maximize the indexing structure efficiency by ensuring that the subdivision of space has as similar coherence as possible along each dimension. It is particularly useful when data distribution along each dimension constantly shows a different degree of coherence from each other dimension. Our preliminary results show that our new tree can achieve higher indexing structure efficiency than previous methods.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124519387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
MAMCost: Global and Local Estimates leading to Robust Cost Estimation of Similarity Queries
Gisele Busichia Baioco, A. Traina, C. Traina
{"title":"MAMCost: Global and Local Estimates leading to Robust Cost Estimation of Similarity Queries","authors":"Gisele Busichia Baioco, A. Traina, C. Traina","doi":"10.1109/SSDBM.2007.17","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.17","url":null,"abstract":"This paper presents an effective cost model to estimate the number of disk accesses (I/O cost) and the number of distance calculations (CPU cost) to process similarity queries over data indexed by metric access methods. Two types of similarity queries were taken into consideration: range and k-nearest neighbor queries. The main point of the cost model is considering not only global parameters of the data set but also the local data distribution. The model takes advantage of the intrinsic dimension of the data set, estimated by its correlation fractal dimension. Experiments were performed on real and synthetic data sets, with different sizes and dimensions, in order to validate the proposed model. They confirmed that the estimations are accurate, within the range achieved by real queries.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121907218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Cost-based Optimization of Complex Scientific Queries
R. Fomkin, T. Risch
{"title":"Cost-based Optimization of Complex Scientific Queries","authors":"R. Fomkin, T. Risch","doi":"10.1109/SSDBM.2007.8","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.8","url":null,"abstract":"High energy physics scientists analyze large amounts of data looking for interesting events when particles collide. These analyses are easily expressed using complex queries that filter events. We developed a cost model for aggregation operators and other functions used in such queries and show that it substantially improves performance. However, the query optimizer still produces suboptimal plans because of estimate errors. Furthermore, the optimization is very slow because of the large query size. We improved the optimization by a profiled grouping strategy where the scientific query is first automatically fragmented into subqueries based on application knowledge. Each fragment is then independently profiled on a sample of events to measure real execution cost and cardinality. An optimized fragmented query is shown to execute faster than a query optimized with the cost model alone. Furthermore, the total optimization time, including fragmentation and profiling, is substantially improved.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129243602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Update Conscious Bitmap Indices
G. Canahuate, Michael Gibas, H. Ferhatosmanoğlu
{"title":"Update Conscious Bitmap Indices","authors":"G. Canahuate, Michael Gibas, H. Ferhatosmanoğlu","doi":"10.1109/SSDBM.2007.24","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.24","url":null,"abstract":"Bitmap indices have been widely used in several domains such as data warehousing and scientific applications due to their efficiency in answering certain query types over large data sets. However, their utilization has been largely limited to read-only data sets or to static snapshots of data due to the cost associated with the update and append of new data. Typically, several bitmaps are associated with each indexed attribute in a table, i.e. one for each attribute value, bin, or range. Each one of these bitmaps needs to be updated to reflect a new, appended row. Since a given table could be represented by hundreds or even thousands of bitmaps, the insertion of a single record can be prohibitively costly. In order to transfer the fast query response times offered by bitmap indices to dynamic database domains, we propose an update conscious bitmap index that provides a mechanism to quickly update bitmaps to reflect dynamic database changes. For an insert operation only the bitmaps that represent the values being inserted need to be updated. We formalize the insert and delete operations of the proposed technique and provide a cost model for bitmap updates. 
We compare the update conscious bitmaps to traditional bitmaps in terms of storage space, update performance, and query execution time.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"147 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128836061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
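The update cost this abstract describes for traditional bitmap indices can be illustrated with a small sketch. It assumes an uncompressed, equality-encoded index with one bitmap per distinct attribute value, where appending a row extends every bitmap. The class `SimpleBitmapIndex` and its methods are hypothetical names, and this is the traditional scheme, not the update conscious index the paper proposes.

```python
from typing import Dict, Hashable, List


class SimpleBitmapIndex:
    """Toy equality-encoded bitmap index over a single attribute.

    Illustrative only: one uncompressed bitmap (a list of 0/1) per value.
    """

    def __init__(self) -> None:
        self.bitmaps: Dict[Hashable, List[int]] = {}
        self.num_rows = 0

    def append(self, value: Hashable) -> int:
        """Append a row and return how many bitmaps had to be modified."""
        if value not in self.bitmaps:
            # A value seen for the first time gets a bitmap of zeros
            # covering all rows appended so far.
            self.bitmaps[value] = [0] * self.num_rows
        touched = 0
        for v, bits in self.bitmaps.items():
            # Every bitmap grows by one bit, even those for other values.
            bits.append(1 if v == value else 0)
            touched += 1
        self.num_rows += 1
        return touched

    def rows_equal(self, value: Hashable) -> List[int]:
        """Return the row numbers whose attribute equals value."""
        bits = self.bitmaps.get(value, [])
        return [i for i, b in enumerate(bits) if b]


if __name__ == "__main__":
    idx = SimpleBitmapIndex()
    for v in ["a", "b", "a", "c"]:
        print(v, "-> bitmaps touched:", idx.append(v))
    print("rows with 'a':", idx.rows_equal("a"))
```

In this sketch the number of bitmaps touched per insert grows with the number of distinct values, which is the cost the abstract says makes single-record insertion prohibitive for tables with hundreds or thousands of bitmaps.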
Gene Ontology-Based Annotation Analysis and Categorization of Metabolic Pathways
A. Cakmak
{"title":"Gene Ontology-Based Annotation Analysis and Categorization of Metabolic Pathways","authors":"A. Cakmak","doi":"10.1109/SSDBM.2007.35","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.35","url":null,"abstract":"Functional characterizations of pathways provide new opportunities in defining, understanding, and comparing existing biological pathways, and in helping discover new ones in different organisms. In this paper, we present and evaluate computational techniques for categorizing pathways, based upon the Gene Ontology (GO) annotations of enzymes within metabolic pathways. Our approach is to use the notion of functionality templates, GO-functional graphs of pathways. Pathway categorization is then achieved through learning models built on different characteristics of functionality templates. We have experimentally evaluated the accuracy of automated pathway categorization with respect to different learning models and their parameters. Using KEGG metabolic pathways, the pathway categorization tool reaches to 90% and higher accuracy.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131555911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Mining RNA Tertiary Motifs with Structure Graphs
Xueyi Wang, Jun Huan, J. Snoeyink, Wei Wang
{"title":"Mining RNA Tertiary Motifs with Structure Graphs","authors":"Xueyi Wang, Jun Huan, J. Snoeyink, Wei Wang","doi":"10.1109/SSDBM.2007.38","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.38","url":null,"abstract":"We present a novel application of graph database mining to identify tertiary motifs in RNA structures. In our method, we abstract an RNA molecule as a labeled graph and use a frequent subgraph mining technique to derive tertiary motifs. By applying our technique to ribosome RNA and transfer RNA, we have identified known RNA tertiary motifs such as the ribose zipper and U-turn, plus candidates for novel tertiary motifs. Finally, we suggest an iterative multiple structure alignment algorithm to classify tertiary motifs and generate consensus motifs.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134645352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Maintaining K-Anonymity against Incremental Updates
J. Pei, Jian Xu, Zhibin Wang, Wei Wang, Ke Wang
{"title":"Maintaining K-Anonymity against Incremental Updates","authors":"J. Pei, Jian Xu, Zhibin Wang, Wei Wang, Ke Wang","doi":"10.1109/SSDBM.2007.16","DOIUrl":"https://doi.org/10.1109/SSDBM.2007.16","url":null,"abstract":"K-anonymity is a simple yet practical mechanism to protect privacy against attacks of re-identifying individuals by joining multiple public data sources. All existing methods achieving k-anonymity assume implicitly that the data objects to be anonymized are given once and fixed. However, in many applications, the real world data sources are dynamic. In this paper, we investigate the problem of maintaining k-anonymity against incremental updates, and propose a simple yet effective solution. We analyze how inferences from multiple releases may temper the k-anonymity of data, and propose the monotonic incremental anonymization property. The general idea is to progressively and consistently reduce the generalization granularity as incremental updates arrive. Our new approach guarantees the k-anonymity on each release, and also on the inferred table using multiple releases. At the same time, our new approach utilizes the more and more accumulated data to reduce the information loss.","PeriodicalId":122925,"journal":{"name":"19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124955328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 99
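As background for the k-anonymity property this abstract builds on, the following sketch checks whether a single released table is k-anonymous: every combination of quasi-identifier values must be shared by at least k rows. The function name, the column names, and the sample rows are illustrative assumptions, and the check covers one release only, not the multi-release inference problem the paper addresses.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def is_k_anonymous(
    rows: List[Dict[str, Hashable]], quasi_identifiers: Sequence[str], k: int
) -> bool:
    """Return True if every quasi-identifier combination occurs at least k times.

    Hypothetical helper for illustration; an empty table is treated as
    trivially k-anonymous.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Count how many rows fall into each group of identical quasi-identifiers.
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


if __name__ == "__main__":
    # Made-up, already generalized sample data.
    release = [
        {"age": "20-29", "zip": "130**", "diagnosis": "flu"},
        {"age": "20-29", "zip": "130**", "diagnosis": "cold"},
        {"age": "30-39", "zip": "148**", "diagnosis": "flu"},
        {"age": "30-39", "zip": "148**", "diagnosis": "asthma"},
    ]
    print(is_k_anonymous(release, ["age", "zip"], 2))
    print(is_k_anonymous(release, ["age", "zip"], 3))
```

A table can pass this per-release check and still leak information once several releases are compared, which is the gap the abstract describes.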