{"title":"Information technology implementation for a distributed data system serving Earth scientists: seasonal to interannual ESIP","authors":"M. Kafatos, X. Wang, Zuotao Li, Ruixin Yang, D. Ziskin","doi":"10.1109/SSDM.1998.688126","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688126","url":null,"abstract":"We address the implementation of a distributed data system designed to serve Earth system scientists. A consortium led by George Mason University has been funded by NASA's Working Prototype Earth Science Information Partner (WP-ESIP) program to develop, implement, and operate a distributed data and information system. The system will address the research needs of seasonal to interannual scientists whose research focus includes phenomena such as El Nino, monsoons and associated climate studies. The system implementation involves several institutions using a multitiered client-server architecture. Specifically the consortium involves an information system of three physical sites, GMU, the Center for Ocean-Land-Atmosphere Studies (COLA) and the Goddard Distributed Active Archive Center, distributing tasks in the areas of user services, access to data, archiving, and other aspects enabled by a low-cost, scalable information technology implementation. The project can serve as a model for a larger WP-ESIP Federation to assist in the overall data information system associated with future large Earth Observing System data sets and their distribution. The consortium has developed innovative information technology techniques such as content based browsing, data mining and associated component working prototypes; analysis tools particularly GrADS developed by COLA, the preferred analysis tool of the working seasonal to interannual communities; and a Java front-end query engine working prototype.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115119917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational issues connected with the protection of sensitive statistics by auditing sum-queries","authors":"F. M. Malvestuto, M. Moscarini","doi":"10.1109/SSDM.1998.688118","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688118","url":null,"abstract":"An implementation of the auditing strategy is presented to avoid both exact and approximate disclosure. The key data structure is a query map, which is a graphical summary of answered queries. Since the size of a query map may be exponential in the number of answered queries, a query-restriction criterion is introduced to make every query map a graph. An auditing procedure on such a graph is presented and the computational issues connected with its implementation are discussed. All the computational tasks can be carried out efficiently but one, which is a provably intractable problem.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128790850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attribute uncertainty propagation in vector geographic information systems: sensitivity analysis","authors":"O. Bonin","doi":"10.1109/SSDM.1998.688134","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688134","url":null,"abstract":"This paper presents a geographical sensitivity analysis on a vector road database. It consists in introducing controlled noise to the database and in studying the effects of this noise on the results of a chosen application. The objective is to give users the means to evaluate the accuracy of their application results for given quality parameters.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134105658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling multidimensional databases, cubes and cube operations","authors":"Panos Vassiliadis","doi":"10.1109/SSDM.1998.688111","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688111","url":null,"abstract":"Online analytical processing (OLAP) is a trend in database technology, which has attracted the interest of a lot of research work. OLAP is based on the multidimensional view of data, supported either by multidimensional databases (MOLAP) or relational engines (ROLAP). We propose a model for multidimensional databases. Dimensions, dimension hierarchies and cubes are formally introduced. We also introduce cube operations (changing of levels in the dimension hierarchy, function application, navigation etc.). The approach is based on the notion of the base cube, which is used for the calculation of the results of cube operations. We focus our approach on the support of a series of operations on cubes (i.e., the preservation of the results of previous operations and the applicability of aggregate functions in a series of operations). Furthermore, we provide a mapping of the multidimensional model to the relational model and to multidimensional arrays.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115385037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An extensible framework for spatio-temporal database applications","authors":"Glaucia Faria, C. B. Medeiros, M. Nascimento","doi":"10.1109/SSDM.1998.688124","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688124","url":null,"abstract":"There is a wide range of scientific applications requiring sophisticated management of spatio-temporal data. However existing database management systems offer very limited support for managing such data. Thus, it is left to the researchers themselves to repeatedly code this management into each application. We present an extensible framework, based on extending an object-oriented database system, with kernel spatio-temporal classes, data structures and functions, to provide support for the development of spatio-temporal applications.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115417702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ConIstat: a system to manage, record and present cyclical data","authors":"A. Sorce, F. Rizzo","doi":"10.1109/SSDM.1998.688129","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688129","url":null,"abstract":"The paper discusses ConIstat, a system to manage and present cyclical data organised in historical series to traditional and untraditional users. The data bank contains about 6300 time series, and is organized with the following dominions: external trade, invoiced, consistencies and ordered, production prices, index of work of the great enterprises, industrial production, contractual wages and salaries.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117203804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Benchmarking spatial joins a la carte","authors":"O. Günther, Vincent Oria, P. Picouet, J. Saglio, M. Scholl","doi":"10.1109/SSDM.1998.688109","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688109","url":null,"abstract":"Spatial joins are join operations that involve spatial data types and operators. Spatial access methods are often used to speed up the computation of spatial joins. We address the issue of benchmarking spatial join operations. For this purpose, we first present a WWW-based benchmark generator to produce sets of rectangles. Using a Web browser experimenters can specify the number of rectangles in a sample, as well as the statistical distributions of their sizes, shapes, and locations. Second, using the generator and a well-defined set of statistical models we define several tests to compare the performance of three spatial join algorithms: nested loop, scan-and-index, and synchronized tree traversal. We also added a real-life data set from the Sequoia 2000 storage benchmark. Our results show that the relative performance of the different techniques mainly depends on two parameters: sample size, and selectivity of the join predicate. All of the statistical models and algorithms are available on the Web, which allows for easy verification and modification of our experiments.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125126154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metabolic pathway interface to molecular biology databases","authors":"Barry Zeeberg, Kevin Watanabe, S. Goto, R. Overbeek, L. Kerschberg, George Michaels","doi":"10.1109/SSDM.1998.688132","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688132","url":null,"abstract":"We present results of providing database support to biomedicine via federation of SDB Cooperation/Integration based upon the KEGG GUI for molecular biology. The federation provides a common link to three molecular biology databases. The added value of the federation is freedom from consulting multiple references to ascertain the full set of enzymatic reactions in a metabolic pathway, and the option of selecting multiple queries to submit to the federated SDBs. Each of the SDBs is extensive, but incomplete. The union of the SDBs, implemented transparently by the federation, is more complete. Each SDB provides a different approach to the options available for data presentation and a different set of Web server tools for data analysis. Thus, an important part of the added value of the federation is the cross-fertilization available in the union of the molecular biological content, the presentation of data, and the tools available for analysis.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126257012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A pyramid data model for supporting content-based browsing and knowledge discovery","authors":"Zuotao Li, X. Wang, M. Kafatos, Ruixin Yang","doi":"10.1109/SSDM.1998.688121","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688121","url":null,"abstract":"Remote sensing from space can provide global and continuous observations. The associated measurement data need to be stored and studied to understand the Earth system processes. The ability of interactive content-based browsing, i.e., browsing or searching the content to narrow-down the interesting portions of data sets prior to actually accessing or ordering full data sets, is highly desirable for any Earth science data information system. However the large volumes of archived and future Earth science remote sensing data are clearly a serious challenge for an interactive browsing process. In this paper a pyramid data model is introduced to support interactive content-based browsing and knowledge discovery for a wide variety of Earth science remote sensing data sets. By using multi-level precomputation and robust nonparametric approximation procedures, the interactive browsing performance can be enhanced greatly. An initial implementation and testing of this data model has been carried out through our research prototype system, Virtual Domain Application Data Center (VDADC). Future implementations are planned for our Seasonal to Interannual Earth Science Information Partner (SIESIP) project.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127732670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determining the optimal file size on tertiary storage systems based on the distribution of query sizes","authors":"L. Bernardo, H. Nordberg, D. Rotem, A. Shoshani","doi":"10.1109/SSDM.1998.688108","DOIUrl":"https://doi.org/10.1109/SSDM.1998.688108","url":null,"abstract":"In tertiary storage systems, the data is stored on multiple tape volumes where each tape is further divided into files. Since in many such systems the minimum unit of data transfer is a file, it is an important problem to match file sizes with the access patterns to the data. In general, if the file size is large relative to the query size it will lead to the transfer of large amounts of irrelevant data whereas small file sizes will incur an overhead penalty associated with reading each new file. In this work, we analyze the relationship between file sizes and query response times and provide a methodology to compute the optimal file size given information about the distribution of query sizes. Exact closed form solutions for the cost function are given for two common distributions.","PeriodicalId":120937,"journal":{"name":"Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126687077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}