Conference on Innovative Data Systems Research: Latest Publications

Lessons Learned from Managing a Petabyte
Conference on Innovative Data Systems Research Pub Date : 2005-01-20 DOI: 10.2172/839755
J. Becla, Daniel L. Wang
Abstract: The amount of data collected and stored by the average business doubles each year. Many commercial databases are already approaching hundreds of terabytes, and at this rate, will soon be managing petabytes. More data enables new functionality and capability, but the larger scale reveals new problems and issues hidden in "smaller" terascale environments. This paper presents some of these new problems along with implemented solutions in the framework of a petabyte dataset for a large High Energy Physics experiment. Through experience with two persistence technologies, a commercial database and a file-based approach, we expose format-independent concepts and issues prevalent at this new scale of computing.
Citations: 50
A Case for Staged Database Systems
Conference on Innovative Data Systems Research Pub Date : 1900-01-01 DOI: 10.1007/978-0-387-39940-9_3675
S. Harizopoulos, A. Ailamaki
Citations: 75
Cache-Oblivious Query Processing
Conference on Innovative Data Systems Research Pub Date : 1900-01-01 DOI: 10.14711/thesis-b1029228
Bingsheng He, Qiong Luo
Abstract: As CPU caches have become a performance bottleneck for main memory databases, optimizing the cache performance is essential for high-performance query processing on relational databases. Cache-oblivious techniques, proposed by the theory community, have optimal asymptotic bounds on the amount of data transferred between any two adjacent levels of an arbitrary memory hierarchy. Moreover, this optimal performance is achieved without any hardware platform specific tuning. These properties are highly attractive to autonomous databases, especially because the hardware architectures are becoming increasingly complex and diverse.
In this thesis, we present the design, implementation, and evaluation of the first cache-oblivious, in-memory query processor, EaseDB. All query processing algorithms in EaseDB are designed to be cache-oblivious and match the performance of their cache-conscious counterparts. Moreover, we discuss the inherent limitations of the cache-oblivious approach as well as the opportunities given by the upcoming hardware architectures. Specifically, a cache-oblivious technique usually requires sophisticated algorithm design to achieve a comparable performance to its cache-conscious counterpart. Nevertheless, this development-time effort is compensated by the automaticity of performance achievement and the reduced ownership cost. We evaluate EaseDB in comparison with its cache-conscious counterparts on different architectures including Intel, AMD and Ultra-Sparc processors. Our results, with homegrown workloads and micro benchmarks, show that our cache-oblivious algorithms achieve a performance comparable to their fine-tuned cache-conscious counterparts. Moreover, cache-oblivious techniques can outperform their cache-conscious counterparts in multi-threading processors.
Citations: 17
(Almost) Hands-Off Information Integration for the Life Sciences
Conference on Innovative Data Systems Research Pub Date : 1900-01-01 DOI: 10.18452/9201
U. Leser, Felix Naumann
Abstract: Data integration in complex domains, such as the life sciences, involves either manual data curation, offering highest information quality at highest price, or follows a schema integration and mapping approach, leading to moderate information quality at a moderate price. We suggest a radically different integration approach, called ALADIN, for the life sciences application domain. The predominant feature of the ALADIN system is an architecture that allows almost automatic integration of new data sources into the system, i.e., it offers data integration at almost no cost. We suggest a novel combination of data and text mining, schema matching, and duplicate detection to combat the reduction in information quality that seems inevitable when demanding a high degree of automatism. These heuristics can also lead to the detection of previously unknown or unseen relationships between objects, thus directly supporting the discovery-based work of life science researchers. We argue that such a system is a valuable contribution in two areas. First, it offers challenging and new problems for database research. Second, the ALADIN system would be a valuable knowledge resource for life science research.
Citations: 33
DPI: The Data Processing Interface for Modern Networks
Conference on Innovative Data Systems Research Pub Date : 1900-01-01 DOI: 10.18420/btw2019-ws-02
G. Alonso, Carsten Binnig, I. Pandis, K. Salem, Jan Skrzypczak, Ryan Stutsman, Lasse Thostrup, Tianzheng Wang, Zeke Wang, Tobias Ziegler
Abstract: As data processing evolves towards large scale, distributed platforms, the network will necessarily play a substantial role in achieving efficiency and performance. Increasingly, switches, network cards, and protocols are becoming more flexible while programmability at all levels (aka, software defined networks) opens up many possibilities to tailor the network to data processing applications and to push processing down to the network elements.
In this paper, we propose DPI, an interface providing a set of simple yet powerful abstractions flexible enough to exploit features of modern networks (e.g., RDMA or in-network processing) suitable for data processing. Mirroring the concept behind the Message Passing Interface (MPI) used extensively in high-performance computing, DPI is an interface definition rather than an implementation so as to be able to bridge different networking technologies and to evolve with them. In the paper we motivate and discuss key primitives of the interface and present a number of use cases that show the potential of DPI for data-intensive applications, such as analytic engines and distributed database systems.
Citations: 31