{"title":"Beyond interoperability-tracking and managing the results of computational applications","authors":"J. Cushing, J. Laird, E. Pasalic, E. Kutter, T. Hunkapiller, F. Zucker, D. Yee","doi":"10.1109/SSDM.1997.621191","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621191","url":null,"abstract":"Molecular biology applications, like those of other scientific domains, need to store and view large amounts of specialized quantitative information. With the advent of high speed sequencing technology and considerable funding to \"map\" the genomes of key biological organisms, public databases such as GenBank, PDB, EMBL, JIPID, and SwissProt make millions of genetic sequences available to molecular biologists, and industry and university laboratories maintain large databases. The need for common interfaces and query languages to exploit these heterogeneous databases is well documented, and several such systems now exist or are under development. The authors' own work on database and program interoperability in this domain has shown, however, that providing an interface is but a first step towards making these databases fully useful. The system they are developing integrates and trades inputs and results from numerous computational biology programs. It helps researchers organize result items from sequence comparisons into \"clusters\" that can be marked, named, annotated, and manipulated. An alpha version is implemented in Smalltalk. The paper describes the scientific problem the system aims to solve, as well as current barriers to development and research opportunities suggested by those barriers. They present its conceptual data model, the current prototype, and future implementation plans.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121459124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple structure for statistical meta-data","authors":"A. Westlake","doi":"10.1109/SSDM.1997.621188","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621188","url":null,"abstract":"The paper is a contribution to the debate about the nature of meta-data. The author argues that meta-data is not just data because its effective use requires functionality which is not usually present in an RDBMS. He presents a simple data structure for storing statistical meta-data and discusses the functionality needed for statistical uses. However, because meta-data is data one can use standard RDBMS facilities as well, to increase the usefulness of the meta-data beyond the basic requirements.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131860094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing and accessing scientific databases with the Object-Protocol Model (OPM) data management tools","authors":"I. Chen, A. Kosky, V. Markowitz, E. Szeto","doi":"10.1109/SSDM.1997.621167","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621167","url":null,"abstract":"The Object-Protocol Model (OPM) data management tools provide facilities for rapid development, documentation, and flexible exploration of scientific databases. The tools are based on OPM, an object oriented data model which is similar to the ODMG standard, but also supports extensions for modeling scientific data (L.A. Chen and V.M. Markowitz, 1995). Databases designed using OPM can be implemented using a variety of commercial relational DBMSs, using schema translation tools that generate complete DBMS database definitions from OPM schemas (L.A. Chen and V.M. Markowitz, 1996). Further OPM schemas can be retrofitted on top of existing databases defined using a variety of notations, such as the relational data model or the ASN.1 data exchange format, using OPM retrofitting tools (L.A. Chen et al., 1997).","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117044464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Security problems for statistical databases with general cell suppressions","authors":"T. Hsu, M. Kao","doi":"10.1109/SSDM.1997.621180","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621180","url":null,"abstract":"Studies statistical database problems for 2D tables whose regular cells, row sums, column sums and table sums may be suppressed. Using graph-theoretical techniques, we give optimal or efficient algorithms for the query system problem, the adversary problem and the minimum complementary suppression problem. These three problems are considered for a variety of data security requirements such as those of protecting linear invariants, analytic invariants, k rows (or columns) as a whole, and a table as a whole.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121544831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The S-PLUS DataBlade for INFORMIX-Universal Server. The natural wedding of an object relational database with an object-oriented data analysis engine","authors":"R. D. Martin, V. Chalana","doi":"10.1109/SSDM.1997.621173","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621173","url":null,"abstract":"The S-PLUS DataBlade module for the INFORMIX-Universal Server (IUS) combines the strength of a powerful and extensible object-relational database management system with the powerful data analysis, modeling and visualization capabilities of the object-oriented S-PLUS environment. We perform S-PLUS data analysis on data stored in IUS using SQL, the industry-standard query language. The S-PLUS DataBlade module allows S-PLUS expressions and commands to be embedded within SQL expressions, and to conveniently pass data between IUS and S-PLUS. This paper describes the architecture and the capabilities of the S-PLUS DataBlade module.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114569746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"For scientific data discovery: why can't the archive be more like the Web?","authors":"T. Hinke, J. Rushing, Shalini Kansal, S. Graves, H. Ranganath","doi":"10.1109/SSDM.1997.621160","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621160","url":null,"abstract":"The paper addresses the problem of acquiring from scientific data, metadata that is descriptive of the actual content of the data. Scientists can use this content based metadata in subsequent archive searches to find data sets of interest. Such metadata would be especially useful in large scientific archives such as NASA's Earth Observing System Data and Information System (EOSDIS). The paper presents two generic approaches for content based metadata acquisition: target dependent and target independent. Both of these approaches are oriented toward characterizing datasets in terms of the scientific phenomena, such as mesoscale convective systems (severe storms) that they contain. In the target dependent approach, the archived data is mined for particular phenomena of interest and polygons representing the phenomena are stored in a spatial database where they can be used in the data search process. In the target independent approach, data is initially mined for deviations from normal and for trends. This data can then be used for subsequent searches for particular transient phenomena using the deviation data, or for phenomena related to trends. The paper describes results from implementing both of these approaches.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114659074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data mining and modeling in scientific databases","authors":"E. Kapetanios, M. Norrie","doi":"10.1109/SSDM.1997.621146","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621146","url":null,"abstract":"In the last few decades, the execution of various scientific experiments aimed at a more comprehensive understanding of one's environment, has shown a tremendous increase in data production. Database models provide a more or less adequate mechanism for mapping real-world applications into a computer-bound reality. Since scientific knowledge can be modelled a priori only to some extent, the question arises of how able a database schema is to evolve. On the other hand, knowledge can be provided by the underlying scientific data on which data mining algorithms are applied. The main question which arises is how to provide a suitable environment in order to accommodate the results coming out from data analysis tasks and how these tasks can be supported by a database model.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134638527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constructing and maintaining scientific database views in the framework of the object-protocol model","authors":"I. Chen, A. Kosky, V. Markowitz, E. Szeto","doi":"10.1109/SSDM.1997.621192","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621192","url":null,"abstract":"Scientific databases (ScDBs) are used to archive and retrieve data describing objects of scientific inquiry. Since these ScDBs must provide continuous and efficient access to large communities of scientists, they are often developed with reliable commercial relational database management systems (DBMSs) or file systems. However, relational DBMSs and flat files do not provide constructs for representing directly ScDB-specific objects and experimental procedures, and therefore they are often hard to develop, maintain, and explore. The authors present a retrofitting tool for constructing and maintaining ScDB views using an object-oriented data model, and describe their experience with retrofitting ScDBs that have been originally developed using relational DBMSs and file systems. The retrofitting tool is part of a data management toolkit based on the object-protocol model (OPM). The OPM toolkit provides facilities for developing databases defined using OPM and for querying and browsing such ScDBs in terms of OPM constructs. The OPM retrofitting tool allows constructing (one or several) OPM views for ScDBs that have not been originally developed with the OPM tools. ScDBs with native OPM schemas or retrofitted OPM views can be browsed and queried via OPM interfaces, reorganized, or incorporated into an OPM-based database federation.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134437481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel input/output with heterogeneous disks","authors":"S. Kuo, M. Winslett, Ying Chen, Yong Cho, M. Subramaniam, K. Seamons","doi":"10.1109/SSDM.1997.621154","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621154","url":null,"abstract":"Panda is a high performance library for accessing large multidimensional array data on secondary storage of parallel platforms and networks of workstations. When using Panda as the I/O component of a scientific application, H3expresso, on the IBM SP2 at Cornell Theory Center, we found that some nodes are more powerful with respect to I/O than others, requiring the introduction of load balancing techniques to maintain high performance. We expect that heterogeneity will also be a big issue for DBMSs or parallel I/O libraries designed for scientific applications running on networks of workstations, and the methods of allocating data to servers in these environments will need to be upgraded to take heterogeneity into account, while still allowing users to exert control over data layout. We propose such an approach to load balancing, under which we respect the user's choice of high level disk layout, but introduce automatic subchunking. The use of subchunks allows us to divide the very large chunks typically specified by the user's disk layout into more manageable size units that can be allocated to I/O nodes in a manner that fairly distributes the load. We also present two techniques for allocating subchunks to nodes, static and dynamic, and evaluate their performance on the SP2.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116678149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-sample and deterministic confidence intervals for online aggregation","authors":"P. Haas","doi":"10.1109/SSDM.1997.621151","DOIUrl":"https://doi.org/10.1109/SSDM.1997.621151","url":null,"abstract":"The online aggregation system recently proposed by J.M. Hellerstein, et al. (1997) permits interactive exploration of large, complex datasets stored in relational database management systems. Running confidence intervals are an important component of an online aggregation system and indicate to the user the estimated proximity of each running aggregate to the corresponding final result. Large sample confidence intervals contain the final result with a prespecified probability and rest on central limit theorems, while deterministic confidence intervals contain the final query result with probability 1. We show how new and existing central limit theorems, simple bounding arguments, and the delta method can be used to derive formulas for both large sample and deterministic confidence intervals. To illustrate these techniques, we obtain formulas for running confidence intervals in the case of single table and multi table AVG, COUNT, SUM, VARIANCE, and STDEV queries with join and selection predicates. Duplicate elimination and GROUP-BY operations are also considered. We then provide numerically stable algorithms for computing the confidence intervals and analyze the complexity of these algorithms.","PeriodicalId":159935,"journal":{"name":"Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127198454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}