Dynamic Provisioning of Data Intensive Computing Middleware Frameworks: A Case Study

Proceedings of the 1st Workshop on The Science of Cyberinfrastructure: Research, Experience, Applications and Models
Pub Date: 2015-06-16
DOI: 10.1145/2753524.2753528

Linh Ngo, Michael E. Payne, Flavio Villanustre, Richard Taylor, A. Apon

Citations: 2

Abstract

Big data has become an important asset for industry, and academic disciplines now use large-scale data in their research. This fourth paradigm of scientific research has led to the inclusion of data management, processing, and analytic tools in traditional high performance computing software libraries. This integration is facilitated by a collection of supporting software components that together make up a data-intensive computing middleware framework. From a shared campus cyberinfrastructure perspective, this poses a new challenge for system administrators, who must balance the traditional high performance computing software stacks against the new data-intensive middleware on the same physical computing resource. In turn, researchers are kept from accessing the new middleware tools while administrators determine how to overcome the challenge. In this paper, we present our experience in configuring dynamic provisioning of two different data-intensive middleware frameworks from a user perspective. We describe the configuration process, from setting up dependencies to deploying the middleware, and how other researchers and administrators can apply this experience.
