L. Koesterke, K. Milfeld, M. Vaughn, D. Stanzione, J. Koltes, N. Weeks, J. Reecy
{"title":"Optimizing the PCIT algorithm on stampede's Xeon and Xeon Phi processors for faster discovery of biological networks","authors":"L. Koesterke, K. Milfeld, M. Vaughn, D. Stanzione, J. Koltes, N. Weeks, J. Reecy","doi":"10.1145/2484762.2484794","DOIUrl":"https://doi.org/10.1145/2484762.2484794","url":null,"abstract":"The PCIT method is an important technique for detecting interactions between networks. The PCIT algorithm has been used in the biological context to infer complex regulatory mechanisms and interactions in genetic networks, in genome wide association studies, and in other similar problems. In this work, the PCIT algorithm is re-implemented with exemplary parallel, vector, I/O, memory and instruction optimizations for today's multi- and many-core architectures. The evolution and performance of the new code targets the processor architectures of the Stampede supercomputer, but will also benefit other architectures. The Stampede system consists of an Intel Xeon E5 processor base system with an innovative component comprised of Intel Xeon Phi Coprocessors. Optimized results and an analysis are presented for both the Xeon and the Xeon Phi.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114467364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting MapReduce and data compression for data-intensive applications","authors":"Guangchen Ruan, Hui Zhang, Beth Plale","doi":"10.1145/2484762.2484785","DOIUrl":"https://doi.org/10.1145/2484762.2484785","url":null,"abstract":"HPC platforms work well for predominantly compute-intensive jobs; however, data-intensive jobs still struggle on HPC platforms, as large amounts of concurrent data movement from I/O nodes to compute nodes can easily saturate the network links. MapReduce, the \"moving computation to data\" paradigm for many pleasingly parallel applications, assumes that data are resident on local disks and computation is scheduled where the data are located. However, on an HPC machine data must be staged from a broader file system (such as Lustre) to HDFS, where it can be accessed; this staging can represent a substantial delay in processing. In this paper we look at data compression's effect on reducing the bandwidth needed to get data to the application, as well as its impact on the overall performance of data-intensive applications. Our study examines two types of applications: a 3D time-series caries lesion assessment focusing on a large-scale medical image dataset, and an HTRC word-counting task concerning large-scale text analysis running on XSEDE resources. Our extensive experimental results demonstrate significant performance improvement in terms of storage space, data stage-in time, and job execution time.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129190144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biomedical CyberInfrastructure challenges","authors":"Claudiu Farcas, N. Balac, L. Ohno-Machado","doi":"10.1145/2484762.2484767","DOIUrl":"https://doi.org/10.1145/2484762.2484767","url":null,"abstract":"Biomedical research traverses a new era of advancements through the adoption of massive computing and big-data solutions to major scientific problems. However, the road ahead is far from \"a walk in a park\" -- many obstacles exist in the standardization, adoption, and evolution of methods, practices, algorithms, tools, and ultimately knowledge, that would mature along this road. In this article, we discuss such challenges that we encountered in this field and possible solutions from the iDASH program that closely engages this community.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131080867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Landau, Greg Mulder, Raquell Holmes, Sofya Borinskaya, Nam-Hwa Kang, C. Bordeianu
{"title":"INSTANCES: incorporating computational scientific thinking advances into education & science courses","authors":"R. Landau, Greg Mulder, Raquell Holmes, Sofya Borinskaya, Nam-Hwa Kang, C. Bordeianu","doi":"10.1145/2484762.2484769","DOIUrl":"https://doi.org/10.1145/2484762.2484769","url":null,"abstract":"The INSTANCES project strives to create science educational materials that incorporate computation as an essential element [1]. Figure 1 illustrates how the authors incorporate this modern approach of scientific problem solving. Although a decade ago the combination of computing, science and applied mathematics known as computational science was rarely known beyond a few research universities, today K-12 organizations such as the Computer Science Teachers Association [2] and the National Science Teachers Association [3] recommend that secondary school classrooms teach simulation as a cornerstone of scientific inquiry.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132559417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Getting started with high performance computing for humanities, arts, and social science","authors":"Alan B. Craig","doi":"10.1145/2484762.2484788","DOIUrl":"https://doi.org/10.1145/2484762.2484788","url":null,"abstract":"This abstract and presentation address the question of \"Why would someone in humanities, arts, or social science be interested in high performance computing?\", and discuss the resources and assistance that are available to humanists, artists, and social scientists who are interested in high performance computing. The Extreme Science And Engineering Discovery Environment (XSEDE) provides a network of high performance computing resources that are available to researchers. In this talk I will discuss the resources that are available, who is eligible for these resources, and assistance that is available to help you use those resources. My role within XSEDE is to help you get started on XSEDE as well as to help you after you get resources allocated. In this talk I will walk you through the process of applying for an XSEDE startup account and let you know what to expect as you begin using the resources. I will also discuss some of the different types of projects that have been done by humanities, arts, and social science researchers, which range from large scale analysis of texts, images and videos, network analysis (including social media), map based problems, simulations, and others. Finally, I will address some of the lessons I have learned from working with humanities, arts, and social science researchers who are using XSEDE resources. Whether you need computational power, storage, assistance with analysis of large datasets, or are just curious about what these types of resources can do for you, this talk will provide the answers that you are looking for.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125914225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sitao Wu, Weizhong Li, L. Smarr, K. Nelson, Shibu Yooseph, M. Torralba
{"title":"Large memory high performance computing enables comparison across human gut microbiome of patients with autoimmune diseases and healthy subjects","authors":"Sitao Wu, Weizhong Li, L. Smarr, K. Nelson, Shibu Yooseph, M. Torralba","doi":"10.1145/2484762.2484828","DOIUrl":"https://doi.org/10.1145/2484762.2484828","url":null,"abstract":"Microbial communities that live on the outside and inside of the human body dramatically influence human health and diseases. In recent years, major progress has been made in understanding the human microbiome communities through projects such as the Human Microbiome Project (http://commonfund.nih.gov/hmp/), using next generation sequencing technologies and metagenomic approaches. In this paper, we describe a comparative computational analysis of 183 human gut microbiome sequence datasets, drawn from healthy individuals as well as those with autoimmune diseases. About 2.4 TB of Illumina deep sequencing metagenomic data were analyzed using computational workflows we developed, which run multiple steps of data- and computing-intensive analyses such as mapping, sequence assembly, gene identification, clustering and functional annotations. The analyses were carried out on the Gordon supercomputer at the San Diego Supercomputer Center (SDSC), using ~180,000 core hours and tens of TB storage space. Our analysis reveals the detailed microbial composition, dynamics, and functional profiles of the samples and provides new insight into how to correlate microbial profiles with human health and disease states.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"271 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130310516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalasca support for MPI+OpenMP parallel applications on large-scale HPC systems based on Intel Xeon Phi","authors":"B. Wylie, W. Frings","doi":"10.1145/2484762.2484777","DOIUrl":"https://doi.org/10.1145/2484762.2484777","url":null,"abstract":"Intel Xeon Phi coprocessors based on the Many Integrated Core (MIC) architecture are starting to appear in HPC systems, with Stampede being a prominent example available within the XSEDE cyber-infrastructure. Porting MPI and OpenMP applications to such systems is often no more than simple recompilation; however, execution performance needs to be carefully analyzed and tuned to effectively exploit their unique capabilities. For performance measurement and analysis tools, the variety of execution modes needs to be supported in a consistent and convenient manner, especially execution configurations involving large numbers of compute nodes, each with several multicore host processors and many-core coprocessors. Early experience using the open-source Scalasca toolset for runtime summarization and automatic trace analysis with the NPB BT-MZ MPI+OpenMP parallel application on Stampede is reported, along with discussion of on-going and future work.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"31 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130439419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Karimabadi, B. Loring, P. O’leary, A. Majumdar, M. Tatineni, Berk Geveci
{"title":"In-situ visualization for global hybrid simulations","authors":"H. Karimabadi, B. Loring, P. O’leary, A. Majumdar, M. Tatineni, Berk Geveci","doi":"10.1145/2484762.2484822","DOIUrl":"https://doi.org/10.1145/2484762.2484822","url":null,"abstract":"Petascale simulations have become mission critical in diverse areas of science and engineering. Knowledge discovery from such simulations remains a major challenge and is becoming more urgent as the march towards ultra-scale computing with millions of cores continues. One major issue with the current paradigm of running the simulations and saving the data to disk for post-processing is that it is only feasible to save the data at a small number of time slices. This low temporal resolution of the saved data is a serious handicap in many studies where the time evolution of the system is of principal interest. One way to address this I/O issue is through in-situ visualization strategies. The idea is to minimize data storage by extracting important features of the data and saving them, rather than raw data, at high temporal resolution. Parallel file systems of current petascale and future exascale systems are expensive shared resources that need to be utilized effectively, and archival storage can similarly be limited; both will benefit from in-situ visualization, as it leads to a more intelligent way of utilizing storage. In this paper, we present preliminary results from our in-situ visualization for global hybrid (electron fluid, kinetic ions) simulations which are used to study the interaction of the solar wind with planetary magnetospheres such as the Earth and Mercury. In particular, we examine the overhead and the effect on code performance of the inline computations associated with in-situ visualization.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117153109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Henry Neeman, Zane Gray, D. Brunson, Eddie Huebsch, David Horton, James Deaton, Debi Gentis
{"title":"The Oklahoma cyberinfrastructure initiative","authors":"Henry Neeman, Zane Gray, D. Brunson, Eddie Huebsch, David Horton, James Deaton, Debi Gentis","doi":"10.1145/2484762.2484793","DOIUrl":"https://doi.org/10.1145/2484762.2484793","url":null,"abstract":"The Oklahoma Cyberinfrastructure Initiative (OCII) is a mechanism by which institutions in the state can share resources, both physical and human, to enable research and education statewide to utilize advanced computing technologies. OCII provides eight kinds of service: access to cyberinfrastructure; dissemination via an annual conference that has reached over 2500 participants in 11 years; education via a workshop series in person and via videoconferencing; faculty/staff development via summer weeklong workshops; outreach via a supercomputing talk suitable for non-technical audiences; proposal support in the form of both letters of commitment and direct collaboration; technology acquired for institutions or assisting those institutions in acquiring it; workforce development in the form of a mentorship program for Information Technology and Computer Science students statewide. To date, OCII has reached 50 academic and 47 non-academic institutions and organizations.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121297736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Charng-Da Lu, J. Browne, R. L. Deleon, John L. Hammond, W. Barth, T. Furlani, S. Gallo, Matthew D. Jones, A. Patra
{"title":"Comprehensive job level resource usage measurement and analysis for XSEDE HPC systems","authors":"Charng-Da Lu, J. Browne, R. L. Deleon, John L. Hammond, W. Barth, T. Furlani, S. Gallo, Matthew D. Jones, A. Patra","doi":"10.1145/2484762.2484781","DOIUrl":"https://doi.org/10.1145/2484762.2484781","url":null,"abstract":"This paper presents a methodology for comprehensive job level resource use measurement and analysis and applications of the analyses to planning for HPC systems and a case study application of the methodology to the XSEDE Ranger and Lonestar4 systems at the University of Texas. The steps in the methodology are: System-wide collection of resource use and performance statistics at the job and node levels, mapping and storage of the resultant job-wise data to a relational database which eases further implementation and transformation of data to the formats required by specific statistical and analytical algorithms. Analyses can be carried out at different levels of granularity: job, user, or system-wide basis. Measurements are based on a novel lightweight job-centric measurement tool \"TACC_Stats\" [1], which gathers a comprehensive set of metrics on all compute nodes. The data mapping and analysis tools will be an extension to the XDMoD project [2] for the XSEDE community. This paper also reports the preliminary results from the analysis of measured data for Texas Advanced Computing Center's Lonestar4 and Ranger supercomputers. The case studies presented indicate the level of detailed information that will be available for all resources when TACC_Stats is deployed throughout the XSEDE system. The methodology can be applied to any system that runs the TACC_Stats measurement tool.","PeriodicalId":426819,"journal":{"name":"Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115237899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}