The Mercury system: exploiting truly fast hardware for data search

International Workshop on Storage Network Architecture and Parallel I/Os. Pub Date: 2003-09-28. DOI: 10.1145/1162618.1162626

R. Chamberlain, R. Cytron, M. Franklin, R. Indeck

Citations: 48

Abstract

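In many data mining applications, the database is not only extremely large but also growing rapidly. Even for relatively simple searches, the time required to move the data off magnetic media, across the system bus into main memory, and into the processor cache, and then to execute code to perform the search, is prohibitive. We are building a system in which a significant portion of the data mining task (i.e., the portion that examines the bulk of the raw data) is implemented in fast hardware, close to the magnetic media on which the data is stored. Furthermore, this hardware can be replicated, allowing mining tasks to be performed in parallel and thus providing further speedup for the overall mining application. In this paper, we describe a general framework under which this can be accomplished and provide initial performance results for a set of applications.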

Source journal

International Workshop on Storage Network Architecture and Parallel I/Os

Self-citation rate

0.00%

Publication count