The Mercury system: exploiting truly fast hardware for data search

R. Chamberlain, R. Cytron, M. Franklin, R. Indeck
Venue: International Workshop on Storage Network Architecture and Parallel I/Os
Published: 2003-09-28
DOI: 10.1145/1162618.1162626 (https://doi.org/10.1145/1162618.1162626)
Citations: 48

Abstract

In many data mining applications, the size of the database is not only extremely large, it is also growing rapidly. Even for relatively simple searches, the time required to move the data off magnetic media, cross the system bus into main memory, copy into processor cache, and then execute code to perform a search is prohibitive. We are building a system in which a significant portion of the data mining task (i.e., the portion that examines the bulk of the raw data) is implemented in fast hardware, close to the magnetic media on which it is stored. Furthermore, this hardware can be replicated allowing mining tasks to be performed in parallel, thus providing further speedup for the overall mining application. In this paper, we describe a general framework under which this can be accomplished and provide initial performance results for a set of applications.
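
The data flow the abstract describes (scan the raw data where it is stored, replicate the scanning units, and send only results back to the host) can be sketched in software. The following is a minimal illustration under stated assumptions, not the Mercury implementation: `search_unit`, `parallel_search`, and `num_units` are hypothetical names, a substring test stands in for whatever matching the hardware performs, and Python threads stand in for replicated hardware units, so the sketch shows only the structure and not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def search_unit(shard, pattern):
    """Stand-in for one search unit: scan a local shard, return only the matches."""
    return [record for record in shard if pattern in record]


def parallel_search(records, pattern, num_units=4):
    """Stripe records across units, search each shard, and merge the matches."""
    # Each shard plays the role of the data held on one unit's own disk.
    shards = [records[i::num_units] for i in range(num_units)]
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        partial_results = pool.map(search_unit, shards, [pattern] * num_units)
        # Only matching records reach the host, not the bulk of the raw data.
        return sorted(match for part in partial_results for match in part)


if __name__ == "__main__":
    data = ["alpha disk", "beta", "disk array", "gamma"]
    print(parallel_search(data, "disk", num_units=2))
```

The point of this arrangement is that the host only merges small result lists; the full scan stays with the units that hold the data.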