Introducing map-reduce to high end computing

Grant Mackey, S. Sehrish, John Bent, J. López, S. Habib, J. Wang
{"title":"Introducing map-reduce to high end computing","authors":"Grant Mackey, S. Sehrish, John Bent, J. López, S. Habib, J. Wang","doi":"10.1109/PDSW.2008.4811889","DOIUrl":null,"url":null,"abstract":"In this work we present an scientific application that has been given a Hadoop MapReduce implementation. We also discuss other scientific fields of supercomputing that could benefit from a MapReduce implementation. We recognize in this work that Hadoop has potential benefit for more applications than simply data mining, but that it is not a panacea for all data intensive applications. We provide an example of how the halo finding application, when applied to large astrophysics datasets, benefits from the model of the Hadoop architecture. The halo finding application uses a friends of friends algorithm to quickly cluster together large sets of particles to output files which a visualization software can interpret. The current implementation requires that large datasets be moved from storage to computation resources for every simulation of astronomy data. Our Hadoop implementation allows for an in-place halo finding application on the datasets, which removes the time consuming process of transferring data between resources.","PeriodicalId":227342,"journal":{"name":"2008 3rd Petascale Data Storage Workshop","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"75","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 3rd Petascale Data Storage Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDSW.2008.4811889","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 75

Abstract

In this work we present an scientific application that has been given a Hadoop MapReduce implementation. We also discuss other scientific fields of supercomputing that could benefit from a MapReduce implementation. We recognize in this work that Hadoop has potential benefit for more applications than simply data mining, but that it is not a panacea for all data intensive applications. We provide an example of how the halo finding application, when applied to large astrophysics datasets, benefits from the model of the Hadoop architecture. The halo finding application uses a friends of friends algorithm to quickly cluster together large sets of particles to output files which a visualization software can interpret. The current implementation requires that large datasets be moved from storage to computation resources for every simulation of astronomy data. Our Hadoop implementation allows for an in-place halo finding application on the datasets, which removes the time consuming process of transferring data between resources.
将map-reduce引入高端计算
在这项工作中,我们提出了一个科学应用程序,该应用程序已经实现了Hadoop MapReduce。我们还讨论了其他可以从MapReduce实现中受益的超级计算科学领域。在这项工作中,我们认识到Hadoop对更多应用程序有潜在的好处,而不仅仅是数据挖掘,但它并不是所有数据密集型应用程序的灵丹妙药。我们提供了一个例子,说明光晕查找应用程序在应用于大型天体物理数据集时,如何从Hadoop架构模型中受益。光晕查找应用程序使用朋友的朋友算法将大量粒子快速聚类到一起,输出文件,可视化软件可以解释这些文件。目前的实现需要将大型数据集从存储转移到计算资源,以进行每次天文数据模拟。我们的Hadoop实现允许在数据集上进行就地光环查找应用程序,从而消除了在资源之间传输数据的耗时过程。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信