{"title":"Teaching and Learning Graph Algorithms Using Animation","authors":"Y. Liang","doi":"10.22369/ISSN.2153-4136/9/2/3","DOIUrl":"https://doi.org/10.22369/ISSN.2153-4136/9/2/3","url":null,"abstract":"Graph algorithms have many applications. Many real-world problems can be solved using graph algorithms. Graph algorithms are commonly taught in the data structures, algorithms, and discrete mathematics courses. We have created two animations to visually demonstrate the graph algorithms. The first animation is for depthfirst search, breadth-first search, shortest paths, connected components, finding bipartite sets, and Hamiltonian path/cycle on unweighted graphs. The second animation is for the minimum spanning trees, shortest paths, travelling salesman problems on weighted graphs. The animations are developed using HTML, CSS, and JavaScript and are platform independent. They can be viewed from a browser on any device. The animations are useful tools for teaching and learning graph algorithms. This paper presents these animations.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124250708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physics Conceptual Understanding in a Computational Science Course","authors":"Rivka Taub, M. Armoni, M. Ben-Ari","doi":"10.22369/ISSN.2153-4136/9/2/1","DOIUrl":"https://doi.org/10.22369/ISSN.2153-4136/9/2/1","url":null,"abstract":"Students face many difficulties dealing with physics principles and concepts during physics problem solving. For example, they lack the understanding of the components of formulas, as well as of the physical relationships between the two sides of a formula. To overcome these difficulties some educators have suggested integrating simulations design into physics learning. They claim that the programming process necessarily fosters understanding of the physics underlying the simulations. We investigated physics learning in a high-school course on computational science. The course focused on the development of computational models of physics phenomena and programming corresponding simulations. The study described in this paper deals with the development of students' conceptual physics knowledge throughout the course. Employing a qualitative approach, we used concept maps to evaluate students' physics conceptual knowledge at the beginning and the end of the model development process, and at different stages in between. We found that the students gained physics knowledge that has been reported to be difficult for high-school and even undergraduate students. We use two case studies to demonstrate our method of analysis and its outcomes. We do that by presenting a detailed analysis of two projects in which computational models and simulations of physics phenomena were developed.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114449994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parsing Next Generation Sequencing Data in Parallel Environments for Downstream Genetic Variation Analysis","authors":"Mariana Vasquez, J. Mohl, M. Leung","doi":"10.22369/issn.2153-4136/9/2/5","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/9/2/5","url":null,"abstract":"With the recent advances in next generation sequencing technology, analysis of prevalent DNA sequence variants from patients with a particular disease has become an important tool for understanding the associations between the disease and genetic mutations. A publicly accessible bioinformatics pipeline, called OncoMiner (http://oncominer.utep.edu), was implemented in 2016 to help biomedical researchers analyze large genomic datasets from patients with cancer. However, the current version of OncoMiner can only accept input files with a highly specific format for sequence variant description. In order to handle data from a broader range of sequencing platforms, a data preprocessing tool is necessary. We have therefore implemented the OncoMiner Preprocessing (OP) program for parsing data files in the popular FastQ and BAM formats to generate an OncoMiner input file. OP involves using the open source Bowtie2 and SAMtools software, followed by a python script we developed for genetic sequence variant identification. To preprocess very large datasets efficiently, the OP program has been parallelized on two local computers and the Blue Waters system at the National Center for Supercomputing Applications using a multiprocessing approach. Although reasonable parallelization efficiency has been obtained on the local computers, the OP program’s speedup on Blue Waters has been limited, possibly due to I/O issues and individual node memory constraints. Despite these, Blue Waters has provided the necessary resources to process 35 datasets from patients with acute myeloid leukemia and demonstrated significant correlation of OP runtimes with the BAM input size and chromosome diversity.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125744530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Luke, Sarah Fergione, Riley Wilson, B. Gunn, S. Svojanovsky
{"title":"Identification of Active Oligonucleotide Sequences Using Artificial Neural Network","authors":"A. Luke, Sarah Fergione, Riley Wilson, B. Gunn, S. Svojanovsky","doi":"10.22369/ISSN.2153-4136/9/2/4","DOIUrl":"https://doi.org/10.22369/ISSN.2153-4136/9/2/4","url":null,"abstract":"In this project we designed an Artificial Neural Network (ANN) computational model to predict the activity of short oligonucleotide sequences (octamers) with important biological role as exonic splicing enhancers (ESE) motifs recognized by human SR protein SC35. Since only active sequences were available from the literature as our initial data set, we generated an additional set of complementary sequences to the original set. We used back-propagation neural network (BPNN) with MATLAB® Neural Network ToolboxTM on our research designated computer. In Stage I of our project we trained, validated and tested the BPNN prototype. We started with 20 samples in the training and 8 samples in the validation sets. Trained and validated BPNN prototype was then used to test the unique set of 10 octamer sequences with 5 active samples and their 5 complementary sequences. The test showed 2 classification errors, one false positive and the other false negative. We used the test data and moved into Stage II of the project. First, we analyzed the initial DNA numerical representation (DNR) and changed the scheme to achieve higher difference between the subsets of active and complementary sequences. We compared the BPNN results with different numbers of nodes in the second hidden layer to optimize model accuracy. To estimate future model performance we needed to test the classifier on newly collected data from another paper. This practical application included the testing of 41 published, non-repeating SC35 ESE motif octamers, together with 41 complementary sequences. The test showed high BPNN accuracy in the predictive power for both (active and inactive) categories. This study shows the potential for using a BPNN to screen SC35 ESE motif candidates.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133130299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Feature Selection in Markov State Models Using Genetic Algorithm","authors":"Qihua Chen, Jiangyan Feng, S. Mittal, D. Shukla","doi":"10.22369/issn.2153-4136/9/2/2","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/9/2/2","url":null,"abstract":"Markov State Models (MSMs) are a powerful framework to reproduce the long-time conformational dynamics of biomolecules using a set of short Molecular Dynamics (MD) simulations. However, precise kinetics predictions of MSMs heavily rely on the features selected to describe the system. Despite the importance of feature selection for large system, determining an optimal set of features remains a difficult unsolved problem. Here, we introduce an automatic approach to optimize feature selection based on genetic algorithms (GA), which adaptively evolves the most fitted solution according to natural selection laws. The power of the GA-based method is illustrated on long atomistic folding simulations of four proteins, varying in length from 28 to 80 residues. Due to the diversity of tested proteins, we expect that our method will be extensible to other proteins and drive MSM building to a more objective protocol.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124584832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scientific Computing, High-Performance Computing and Data Science in Higher Education","authors":"Marcelo Ponce, Erik Spence, D. Gruner, R. Zon","doi":"10.22369/issn.2153-4136/10/1/5","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/10/1/5","url":null,"abstract":"We present an overview of current academic curricula for Scientific Computing, High-Performance Computing and Data Science. After a survey of current academic and non-academic programs across the globe, we focus on Canadian programs and specifically on the education program of the SciNet HPC Consortium, using its detailed enrollment and course statistics for the past four to five years. Not only do these data display a steady and rapid increase in the demand for research-computing instruction, they also show a clear shift from traditional (high performance) computing to data-oriented methods. It is argued that this growing demand warrants specialized research computing degrees. The possible curricula of such degrees are described next, taking existing programs as an example, and adding SciNet's experiences of student desires as well as trends in advanced research computing.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128328316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Potential Influence of Prior Experience in an Undergraduate-Graduate Level HPC Course","authors":"C. Fietkiewicz","doi":"10.22369/issn.2153-4136/10/1/15","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/10/1/15","url":null,"abstract":"A course on high performance computing (HPC) at Case Western Reserve University included students with a range of technical and academic experience. We consider these experiential differences with regard to student performance and perceptions. The course relied heavily on C programming and multithreading, but one third of the students had no prior experience with these techniques. Academic experience also varied, as the class included 3rd and 4th year undergraduates, master’s students, PhD students, and a nondegree student. Results indicate that student performance did not depend on technical experience. However, average overall performance was slightly higher for graduate students. Additionally, we report on students’ perceptions of the course and the assigned work.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127727413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Student Outcomes in Parallelizing Recursive Matrix Multiply","authors":"C. Fietkiewicz","doi":"10.22369/issn.2153-4136/10/1/4","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/10/1/4","url":null,"abstract":"Students in a course on high performance computing were assigned the task of parallelizing an algorithm for recursive matrix multiplication. The objectives of the assignment were to: (1) design a basic approach for incorporating parallel programming into a recursive algorithm, and (2) optimize the speedup. Pseudocode was provided for recursive matrix multiplication, and students were required to first implement a serial version before implementing a parallel version. The parallel version had the following requirements: (1) use OpenMP to perform multithreading, and (2) use exactly 4 threads, where each thread computes one quadrant of the array product. Using a class size of 23 students, including undergraduate and graduate, approximately 70% of the students designed valid parallel solutions, and 13% achieved the optimal speedup of 4×. Common errors included recursively creating excessive threads, failing to parallelize all possible mathematical operations, and poor use of compiler directives for OpenMP.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123262030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eric Coulter, J. Sprouse, Resa Reynolds, R. Knepper
{"title":"Extending XSEDE Innovations to Campus Cyberinfrastructure - The XSEDE National Integration Toolkit","authors":"Eric Coulter, J. Sprouse, Resa Reynolds, R. Knepper","doi":"10.22369/issn.2153-4136/10/1/3","DOIUrl":"https://doi.org/10.22369/issn.2153-4136/10/1/3","url":null,"abstract":"XSEDE Service Providers (SPs) and resources have the benefit of years of testing and implementation, tuning and configuration, and the development of specific tools to help users and systems make the best use of these resources. Cyberinfrastructure professionals at the campus level are often charged with building computer resources which are compared to these national-level resources. While organizations and companies exist that guide cyberinfrastructure configuration choices down certain paths, there is no easy way to distribute the long-term knowledge of the XSEDE project to campus CI professionals. The XSEDE Cyberinfrastructure Resource Integration team has created a variety of toolkits to enable easy knowledge and best-practice transfer from XESDE SPs to campus CI professionals. The XSEDE National Integration Toolkit (XNIT) provides the software used on most XSEDE systems in an effort to propagate the best practices and knowledge of XSEDE resources. XNIT includes basic tools and configuration that make it simpler for a campus cluster to have the same software set and many of the advantages and XSEDE SP resource affords. In this paper, we will detail the steps taken to build such a library of software and discuss the challenges involved in disseminating awareness of toolkits among cyberinfrastructure professionals. We will also describe our experiences in updating the XNIT to be compatible with the OpenHPC project, which forms the basis of many new HPC systems, and appears situated to become the de-facto choice of management software provider for many HPC centers.","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115747819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Training Computational Scientists to Build and Package Open-Source Software","authors":"P. Bisbal","doi":"10.22369/ISSN.2153-4136/10/1/12","DOIUrl":"https://doi.org/10.22369/ISSN.2153-4136/10/1/12","url":null,"abstract":"High performance computing training and education typically emphasizes the first-principles of scientific programming, such as numerical algorithms and parallel programming techniques. However, many computational scientists need to know how to compile and link to applications built by others. Likewise, those who create the libraries and applications need to understand how to organize their code to make it as portable as possible and package it so that it is straightforward for others to use. These topics are not currently addressed by the current HPC education or training curriculum and users are typically left to develop their own approaches. This work will discuss observations made by the author over the last 20 years regarding the common problems encountered in the scientific community when developing their own codes and building codes written by other computational scientists. Recommendations will be provided for a training curriculum to address these shortcomings","PeriodicalId":330804,"journal":{"name":"The Journal of Computational Science Education","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133535954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}