Proceedings of XSEDE16: Diversity, Big Data, and Science at Scale
5th Conference on Extreme Science and Engineering Discovery Environment, July 17-21, 2016, InterContinental Miami Hotel, Miami, Florida, USA
Latest Publications
{"title":"CloudBridge: a Simple Cross-Cloud Python Library.","authors":"Nuwan Goonasekera, Andrew Lonie, James Taylor, Enis Afgan","doi":"10.1145/2949550.2949648","DOIUrl":"10.1145/2949550.2949648","url":null,"abstract":"<p><p>With clouds becoming a standard target for deploying applications, it is more important than ever to be able to seamlessly utilise resources and services from multiple providers. Proprietary vendor APIs make this challenging and lead to conditional code being written to accommodate various API differences, requiring application authors to deal with these complexities and to test their applications against each supported cloud. In this paper, we describe an open source Python library called CloudBridge that provides a simple, uniform, and extensible API for multiple clouds. The library defines a standard 'contract' that all supported providers must implement, and an extensive suite of conformance tests to ensure that any exposed behavior is uniform across cloud providers, thus allowing applications to confidently utilise any of the supported clouds without any cloud-specific code or testing.</p>","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"2016 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8375622/pdf/nihms-1689928.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39349009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Automating XSEDE User Ticket Classification","authors":"Gwang Son, Victor Hazlewood, G. D. Peterson","doi":"10.1145/2616498.2616549","DOIUrl":"https://doi.org/10.1145/2616498.2616549","url":null,"abstract":"The XSEDE ticket system, which is a help desk ticketing system, receives email and web-based problem reports (i.e., tickets) from users and these tickets can be manually grouped into predefined categories either by the ticket submitter or by operations staff. This manual process can be automated by using text classification algorithms such as Multinomial Naive Bayes (MNB) or Softmax Regression Neural Network (SNN). Ticket subjects, rather than whole tickets, were used to make an input word list along with a manual word group list to enhance accuracy. The text mining algorithms used the input word list to select input words in the tickets. Compared with the Matlab svm() function, MNB and SNN showed overall better accuracy (up to ~85.8% using two simultaneous category selection). Also, the service provider resource (i.e., system name) information could be extracted from the tickets with ~90% accuracy.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"65 1","pages":"41:1-41:7"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74102510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Distributed Platforms for Protein-Guided Scientific Workflow","authors":"Natasha Pavlovikj, Kevin Begcy, S. Behera, Malachy T. Campbell, H. Walia, J. Deogun","doi":"10.1145/2616498.2616551","DOIUrl":"https://doi.org/10.1145/2616498.2616551","url":null,"abstract":"Complex and large-scale applications in different scientific disciplines are often represented as a set of independent tasks, known as workflows. Many scientific workflows have intensive resource requirements. Therefore, different distributed platforms, including campus clusters, grids and clouds are used for efficient execution of these workflows. In this paper we examine the performance and the cost of running the Pegasus Workflow Management System (Pegasus WMS) implementation of blast2cap3, the protein-guided assembly approach, on three different execution platforms: Sandhills, the University of Nebraska Campus Cluster, the academic grid Open Science Gird (OSG), and the commercial cloud Amazon EC2. Furthermore, the behavior of the blast2cap3 workflow was tested with different number of tasks. For the used workflows and execution platforms, we perform multiple runs in order to compare the total workflow running time, as well as the different resource availability over time. Additionally, for the most interesting runs, the number of running versus the number of idle jobs over time was analyzed for each platform. The performed experiments show that using the Pegasus WMS implementation of blast2cap3 with more than 100 tasks significantly reduces the running time for all execution platforms. In general, for our workflow, better performance and resource usage were achieved when Amazon EC2 was used as an execution platform. However, due to the Amazon EC2 cost, the academic distributed systems can sometimes be a good alternative and have excellent performance, especially when there are plenty of resources available.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"16 1","pages":"38:1-38:8"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75261615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Challenges in particle tracking in turbulence on a massive scale","authors":"D. Buaria, P. Yeung","doi":"10.1145/2616498.2616526","DOIUrl":"https://doi.org/10.1145/2616498.2616526","url":null,"abstract":"An important but somewhat under-investigated issue in turbulence as a challenge in high-performance computing is the problem of interpolating, from a set of grid points, the velocity of many millions of fluid particles that wander in the flow field, which itself is divided into a larger number of sub-domains according to a chosen domain decomposition scheme. We present below the main elements of the algoithmic strategies that have led to reasonably good performance on two major Petascale computers, namely Stampede and Blue Waters. Performance data are presented at up to 16384 CPU cores for 64 million fluid particles.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"115 1","pages":"11:1-11:2"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77904802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Study of a Minimalistic Simulator on XSEDE Massively Parallel Systems","authors":"Rong Rong, J. Hao, Jason Liu","doi":"10.1145/2616498.2616512","DOIUrl":"https://doi.org/10.1145/2616498.2616512","url":null,"abstract":"Scalable Simulation Framework (SSF), a parallel simulation application programming interface (API) for large-scale discrete-event models, has been widely adopted in many areas. This paper presents a simplified and yet more streamlined implementation, called MiniSSF. MiniSSF maintains the core design concept of SSF, while removing some of the complex but rarely used features, for sake of efficiency. It also introduces several new features that can greatly simplify model development efforts and/or improve the simulator's performance. More specifically, an automated compiler-based source-code translation scheme has been adopted in MiniSSF to enable scalable process-oriented simulation using handcrafted threads. A hierarchical hybrid synchronization algorithm has been incorporated in the simulator to improve parallel performance. Also, a new set of platform-independent API functions have been added for developing simulation models to be executed transparently on different parallel computing platforms. In this paper, we report performance results from experiments on different XSEDE platforms to assess the performance and scalability of MiniSSF. It is shown that the simulator can achieve superior performance. The simulator can adapt its synchronization according to the model's computation and communication demands, as well as the underlying parallel platform. The results also suggest that more automatic adaptation and fine-grained performance tuning is necessary for handling more complex large-scale simulation scenarios.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"11 1","pages":"15:1-15:8"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85630399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamically Provisioning Portable Gateway Infrastructure Using Docker and Agave","authors":"R. Dooley, Joe Stubbs","doi":"10.1145/2616498.2616561","DOIUrl":"https://doi.org/10.1145/2616498.2616561","url":null,"abstract":"The iPlant Agave Developer APIs are a Science-as-a-Service platform for developing modern science gateways. One trend we see emerging from our users is the aggregation of many different, distributed compute and storage systems. The rise in popularity in IaaS, PaaS, and container technologies has made the rapid deployment of elastic gateway infrastructure a reality. In this talk we will introduce Docker and the Agave Developer APIs then demonstrate how to use them to provision applications and infrastructure that are portable across any Linux hosting environment. We will conclude by using our lightweight gateway technology, GatewayDNA, to run an application and move data across multiple systems simultaneously.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"28 6 1","pages":"55:1-55:2"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82787939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating TauDEM as a Scalable Hydrological Terrain Analysis Service on XSEDE","authors":"Ye Fan, Yan Y. Liu, Shaowen Wang, D. Tarboton, Ahmet Artu Yildirim, Nancy Wilkins-Diehr","doi":"10.1145/2616498.2616510","DOIUrl":"https://doi.org/10.1145/2616498.2616510","url":null,"abstract":"In this paper, we present the experience of scaling a parallel hydrological analysis software - TauDEM - to thousands of processors and large elevation datasets through XSEDE ECSS effort and multi-institutional collaboration.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"2016 1","pages":"5:1-5:2"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86681153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science","authors":"R. Moore, C. Baru, Diane A. Baxter, Geoffrey Fox, A. Majumdar, P. Papadopoulos, W. Pfeiffer, R. Sinkovits, Shawn M. Strande, M. Tatineni, R. Wagner, Nancy Wilkins-Diehr, M. Norman","doi":"10.1145/2616498.2616540","DOIUrl":"https://doi.org/10.1145/2616498.2616540","url":null,"abstract":"NSF-funded computing centers have primarily focused on delivering high-performance computing resources to academic researchers with the most computationally demanding applications. But now that computational science is so pervasive, there is a need for infrastructure that can serve more researchers and disciplines than just those at the peak of the HPC pyramid. Here we describe SDSC's Comet system, which is scheduled for production in January 2015 and was designed to address the needs of a much larger and more expansive science community-- the \"long tail of science\". Comet will have a peak performance of 2 petaflop/s, mostly delivered using Intel's next generation Xeon processor. It will include some large-memory and GPU-accelerated nodes, node-local flash memory, 7 PB of Performance Storage, and 6 PB of Durable Storage. These features, together with the availability of high performance virtualization, will enable users to run complex, heterogeneous workloads on a single integrated resource.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"1 1","pages":"39:1-39:8"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89852771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A leap forward with UTK's Cray XC30","authors":"M. Fahey","doi":"10.1145/2616498.2616546","DOIUrl":"https://doi.org/10.1145/2616498.2616546","url":null,"abstract":"This paper shows a significant productivity leap for several science groups and the accomplishments they have made to date on Darter - a Cray XC30 at the University of Tennessee Knoxville. The increased productivity is due to faster processors and interconnect combined in a new generation from Cray, and yet it still has a very similar programming environment as compared to previous generations of Cray machines that makes porting easy.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"1 1","pages":"30:1-30:8"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78193394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Three-Semester, Interdisciplinary Approach to Parallel Programming in a Liberal Arts University Setting","authors":"Mike Morris, Karl Frinkle","doi":"10.1145/2616498.2616567","DOIUrl":"https://doi.org/10.1145/2616498.2616567","url":null,"abstract":"We describe a successful addition of high performance computing (HPC) into a traditional computer science curriculum at a liberal arts university. The approach incorporated a three-semester sequence of courses emphasizing parallel programming techniques, with the final course focusing on a research-level mathematical project that was executed on a TOP500 supercomputer. A group of students with varied programming backgrounds participated in the program. Emphasis was placed on utilizing the Open MPI and CUDA libraries along with parallel algorithm and file I/O analysis.","PeriodicalId":93364,"journal":{"name":"Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.)","volume":"30 1","pages":"66:1-66:7"},"PeriodicalIF":0.0,"publicationDate":"2014-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75046159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}