Data abstraction through density estimation by storage management

K. Meier
Source: Proceedings. Ninth International Conference on Scientific and Statistical Database Management (Cat. No.97TB100150)
Published: 1997-03-01
DOI: https://doi.org/10.1109/SSDM.1997.621149
Citations: 3

Abstract

One way to cope with the constantly growing amount of scientific data to be analyzed is to derive data abstractions from the original data. A data abstraction represents the data in compressed form while maintaining its semantic structure. The author explores data abstractions based on density estimation. The density of a scientific data set is estimated from the directory of a multidimensional data access structure; the resulting estimator is called the directory estimator. Because it is based on multidimensional adaptive histograms, it is computationally efficient even for large data sets and many dimensions. The paper describes the methodology in general and focuses on the estimator's accuracy in particular. That accuracy depends on the parameters of the access structure used, such as the bucket capacity. The author evaluates the choice of bucket capacity both theoretically and empirically, using the integrated squared error (ISE) as the measure of error and a grid file as the data access structure. A useful application of the directory estimator to scientific data is illustrated with a practical example from astronomy.
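
To make the idea concrete, the following is a minimal one-dimensional sketch of a capacity-driven adaptive histogram and a numerical ISE, where ISE is the integral of the squared difference between the estimated and the true density. It is not the paper's directory estimator and does not implement a grid file. The midpoint-splitting rule, the function names (`build_cells`, `density`, `integrated_squared_error`), the `max_depth` guard, and the Beta(2, 5) test density are all assumptions chosen for illustration.

```python
import random


def build_cells(points, lo, hi, capacity, max_depth=32):
    """Split [lo, hi) at its midpoint until each cell holds <= capacity points.

    Returns a list of (lo, hi, count) cells ordered left to right.
    max_depth is an illustrative guard against endless splitting on duplicates.
    """
    if len(points) <= capacity or max_depth == 0:
        return [(lo, hi, len(points))]
    mid = (lo + hi) / 2.0
    left = [p for p in points if p < mid]
    right = [p for p in points if p >= mid]
    return (build_cells(left, lo, mid, capacity, max_depth - 1)
            + build_cells(right, mid, hi, capacity, max_depth - 1))


def density(cells, n, x):
    """Histogram density at x: cell count / (n * cell width), 0 outside all cells."""
    for lo, hi, count in cells:
        if lo <= x < hi:
            return count / (n * (hi - lo))
    return 0.0


def integrated_squared_error(cells, n, true_pdf, lo, hi, steps=2000):
    """Midpoint-rule approximation of the ISE over [lo, hi)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        diff = density(cells, n, x) - true_pdf(x)
        total += diff * diff * h
    return total


def beta_2_5_pdf(x):
    """Density of the Beta(2, 5) distribution on [0, 1]: 30 * x * (1 - x)^4."""
    return 30.0 * x * (1.0 - x) ** 4


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.betavariate(2, 5) for _ in range(2000)]
    # Compare the error for several (hypothetical) bucket capacities.
    for capacity in (5, 20, 80, 320):
        cells = build_cells(data, 0.0, 1.0, capacity)
        ise = integrated_squared_error(cells, len(data), beta_2_5_pdf, 0.0, 1.0)
        print(f"capacity={capacity:4d}  cells={len(cells):4d}  ISE={ise:.5f}")
```

Running the script prints the ISE for a few bucket capacities, which is one simple way to see the trade-off the abstract refers to: small capacities give many narrow, noisy cells, while large capacities give few wide cells that smooth over structure.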