Scientific Cloud Computing: Latest Papers

Auto-scaling of virtual resources for scientific workflows on hybrid clouds

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608036

Younsun Ahn, Yoonhee Kim

Abstract: Cloud computing enables applications to use scalable resources dynamically, so scientists can run large-scale computational experiments in cloud environments. Many-task computing (MTC) must guarantee stable application execution despite rapid changes in the status of physical resources, and must provide high-performance resources over long periods. Auto-scaling with virtualization provides efficient, integrated utilization of cloud resources. Auto-scaling has been actively studied as a resource-management technique for large-scale data centers, but most auto-scaling methods support only performance metrics such as CPU utilization and data transfer latency, and seldom consider the execution deadline or the characteristics of an application. We propose an auto-scaling method that finishes all tasks by a user-specified deadline. The method dynamically allocates VMs to maximize resource utilization while meeting the deadline, taking into account task dependencies and data transfer time in the workflow application. We evaluated the method on a protein annotation workflow application in a hybrid cloud environment. Simulation results show that the method automatically allocates only the resources actually needed while satisfying the deadline constraints.
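The abstract above describes deadline-driven VM allocation that accounts for task dependencies and data transfer time. The following is a minimal sketch of that idea, not the authors' algorithm: it assumes identical VMs, a greedy list scheduler, and a transfer delay charged on every dependency edge. All names (`estimate_makespan`, `min_vms_for_deadline`, the toy workflow) are hypothetical and chosen only for illustration.

```python
def topo_order(runtime, parents):
    """Return the tasks in a deterministic topological order (Kahn's algorithm)."""
    indeg = {t: len(parents.get(t, ())) for t in runtime}
    children = {t: [] for t in runtime}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)
    ready = sorted(t for t in runtime if indeg[t] == 0)
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for c in sorted(children[t]):
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return order


def estimate_makespan(runtime, parents, transfer, n_vms):
    """Greedy list scheduling of the workflow on n_vms identical VMs."""
    vm_free = [0.0] * n_vms  # time at which each VM becomes idle
    finish = {}
    for t in topo_order(runtime, parents):
        # A task is ready once every parent has finished and its data has arrived.
        ready = max(
            (finish[p] + transfer.get((p, t), 0.0) for p in parents.get(t, ())),
            default=0.0,
        )
        vm = min(range(n_vms), key=lambda i: vm_free[i])  # earliest-idle VM
        start = max(ready, vm_free[vm])
        finish[t] = start + runtime[t]
        vm_free[vm] = finish[t]
    return max(finish.values())


def min_vms_for_deadline(runtime, parents, transfer, deadline, max_vms):
    """Smallest VM count whose estimated makespan meets the deadline, else None."""
    for n in range(1, max_vms + 1):
        if estimate_makespan(runtime, parents, transfer, n) <= deadline:
            return n
    return None


# Toy fan-out/fan-in workflow (assumed values, in seconds).
RUNTIME = {"split": 10, "a1": 60, "a2": 60, "a3": 60, "a4": 60, "merge": 10}
PARENTS = {
    "a1": ["split"], "a2": ["split"], "a3": ["split"], "a4": ["split"],
    "merge": ["a1", "a2", "a3", "a4"],
}
TRANSFER = {(p, c): 5 for c, ps in PARENTS.items() for p in ps}

print(min_vms_for_deadline(RUNTIME, PARENTS, TRANSFER, deadline=160, max_vms=8))
print(min_vms_for_deadline(RUNTIME, PARENTS, TRANSFER, deadline=100, max_vms=8))
```

The search is linear rather than binary because greedy list scheduling is not guaranteed to produce a shorter makespan every time a VM is added. A tighter deadline forces more VMs, and a deadline shorter than the critical path cannot be met at any VM count.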

Citations: 7

HEP computing in a context-aware cloud environment

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608035

F. Berghaus, R. Desmarais, I. Gable, C. Leavett-Brown, M. Paterson, R. P. Taylor, A. Charbonneau, R. Sobie

Citations: 0

A distributed architecture for intra- and inter-cloud data management

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608037

I. Kelley

Citations: 2

Science in the cloud: lessons from three years of research projects on Microsoft Azure

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608030

Dennis Gannon, D. Fay, Daron Green, Kenji Takeda, Wenming Yi

Citations: 8

A cloud computing approach to on-demand and scalable CyberGIS analytics

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608032

Pierre Riteau, Myunghwa Hwang, Anand Padmanabhan, Yizhao Gao, Yan Y. Liu, K. Keahey, Shaowen Wang

Abstract: Spatial data analysis has become ubiquitous as geographic information systems (GIS) are widely used to support scientific investigations and decision making in many fields of science, engineering, and the humanities (e.g., ecology, emergency management, environmental engineering and sciences, geosciences, and social sciences). Tremendous data and computational capabilities are needed to handle and analyze massive quantities of spatial data that are collected across multiple spatiotemporal scales and used for diverse purposes. CyberGIS has emerged as a new-generation GIS based on advanced cyberinfrastructure to seamlessly integrate such capabilities into scalable geospatial analytics and modeling tools. One of the key challenges and opportunities of CyberGIS research is to build an on-demand service framework that can manage underlying cyberinfrastructure resources dynamically, in order to provide responsive support for interactive online CyberGIS analytics, for which users can generate massive numbers of service requests in a short amount of time. This paper presents an approach to implementing CyberGIS analytics using cloud computing services in the CyberGIS Gateway, a multiuser and collaborative online problem-solving environment. The primary purpose of this research is to address the question of how to achieve on-demand and scalable CyberGIS analytics that provide a stable response time to the user. We do that through integration with the Nimbus Phantom cloud platform. We then investigate how the cloud platform is able to adaptively handle fluctuating requests for analytics while providing a stable response time.

Citations: 8

Cloud computing data capsules for non-consumptive use of texts

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608031

Jiaan Zeng, Guangchen Ruan, Alexander Crowell, A. Prakash, Beth Plale

Citations: 40

Evaluating storage systems for scientific data in the cloud

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608034

K. Maheshwari, J. Wozniak, Hao Yang, D. Katz, M. Ripeanu, V. Zavala, M. Wilde

Abstract: Infrastructure-as-a-Service (IaaS) clouds are an appealing resource for scientific computing. However, the bare-bones presentation of raw Linux virtual machines leaves much to the application developer. For many cloud applications, effective data handling is critical to efficient application execution. This paper investigates the capabilities of a variety of POSIX-accessible distributed storage systems to manage data access patterns resulting from workflow application executions in the cloud. We leverage the expressivity of the Swift parallel scripting framework to benchmark the performance of a number of storage systems using synthetic workloads and three real-world applications. We characterize two representative commercial storage systems (Amazon S3 and HDFS) and two emerging research-based storage systems (Chirp/Parrot and MosaStore). We find the use of aggregated node-local resources effective and economical compared with remotely located S3 storage. Our experiments show that applications run at scale with MosaStore achieve up to a 30% improvement in makespan compared with those run with S3. We also find that storage-system-driven application deployments in the cloud result in better runtime performance than an on-demand, data-staging-driven approach.

Citations: 6

Mux-Kmeans: multiplex kmeans for clustering large-scale data set

Scientific Cloud Computing Pub Date : 2014-06-23 DOI: 10.1145/2608029.2608033

Chen Li, Yanfeng Zhang, Ming-hai Jiao, Ge Yu

Abstract: The Kmeans clustering algorithm is widely used in a number of scientific applications due to its simple iterative nature and ease of implementation. The quality of the clustering result depends heavily on the selection of initial centroids: different selections of initial centroids produce different clustering results. In practice, people run a series of Kmeans processes serially with multiple initial centroid groups and return the best clustering result among them. However, in the era of big data, a Kmeans process is implemented on MapReduce to scale to large data sets, and even a single Kmeans process on MapReduce requires a considerably long runtime. This paper proposes Mux-Kmeans. Rather than executing multiple Kmeans processes serially, Mux-Kmeans launches these Kmeans processes concurrently with multiple centroid groups. In each iteration, Mux-Kmeans (i) evaluates these Kmeans processes, (ii) prunes the low-quality Kmeans processes, and (iii) incubates new Kmeans processes. After a certain number of iterations, it returns the best of these locally optimal results. We implement Mux-Kmeans on MapReduce and evaluate it on Amazon EC2. The experimental results show that, starting from the same initial centroid groups, the clustering result of Mux-Kmeans is never worse than the best of a series of Kmeans processes. Mux-Kmeans also takes less elapsed time than running multiple Kmeans processes serially.
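The evaluate, prune, and incubate loop described in the abstract above can be sketched on a single machine as follows. This is an illustrative sketch under stated assumptions, not the paper's MapReduce implementation: quality is measured here by the sum of squared errors (SSE), and new processes are incubated by drawing fresh random centroids, which is a placeholder because the abstract does not say how incubation works. The function names and parameters (`lloyd_step`, `mux_kmeans_sketch`, `n_groups`, `n_prune`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_step(points, centroids):
    """One Kmeans iteration: assign points, recompute centroids, return (centroids, sse)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    updated = []
    for old, members in zip(centroids, clusters):
        if members:
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            updated.append(old)  # leave the centroid of an empty cluster in place
    sse = sum(min(sq_dist(p, c) for c in updated) for p in points)
    return updated, sse


def mux_kmeans_sketch(points, k, n_groups=4, n_prune=1, iters=10, seed=0):
    """Advance several centroid groups together, pruning and replacing the worst."""
    rng = random.Random(seed)
    groups = [rng.sample(points, k) for _ in range(n_groups)]
    best = None
    for _ in range(iters):
        # (i) evaluate: advance every group by one step and score it by SSE
        scored = sorted((lloyd_step(points, g) for g in groups), key=lambda r: r[1])
        if best is None or scored[0][1] < best[1]:
            best = scored[0]
        # (ii) prune: drop the n_prune groups with the highest SSE
        survivors = [centroids for centroids, _ in scored[: n_groups - n_prune]]
        # (iii) incubate: assumed strategy, start replacements from random points
        groups = survivors + [rng.sample(points, k) for _ in range(n_prune)]
    return best


# Two well-separated squares of points (toy data, assumed for illustration).
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
CENTROIDS, SSE = mux_kmeans_sketch(POINTS, k=2)
print(sorted(CENTROIDS), SSE)
```

Because all groups advance in lockstep, a poor initialization costs only the iterations it survives before being pruned, rather than a full serial run.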

Citations: 9