Adam Lathers, Mei-Hui Su, A. Kulungowski, A. Lin, Gaurang Mehta, S. Peltier, E. Deelman, Mark Ellisman
{"title":"Enabling parallel scientific applications with workflow tools","authors":"Adam Lathers, Mei-Hui Su, A. Kulungowski, A. Lin, Gaurang Mehta, S. Peltier, E. Deelman, Mark Ellisman","doi":"10.1109/CLADE.2006.1652055","DOIUrl":null,"url":null,"abstract":"Electron tomography is a powerful tool for deriving three-dimensional (3D) structural information about biological systems within the spatial scale spanning 1 nm3 and 10 mm3. With this technique, it is possible to derive detailed models of sub-cellular components such as organelles and synaptic complexes and to resolve the 3D distribution of their protein constituents in situ. Due in part to exponentially growing raw data-sizes, there continues to be a need for the increased integration of high-performance computing (HPC) and grid technologies with traditional electron tomography processes to provide faster data processing throughput. This is increasingly relevant because emerging mathematical algorithms that provide better data fidelity are more computationally intensive for larger raw data sizes. Progress has been made towards the transparent use of HPC and grid tools for launching scientific applications without passing on the necessary administrative overhead and complexity (resource administration, authentication, scheduling, data delivery) to the non-computer scientist end-user. There is still a need, however, to simplify the use of these tools for applications developers who are developing novel algorithms for computation. Here we describe the architecture of the Telescience project (http://telescience.ucsd.edu), specifically the use of layered workflow technologies to parallelize and execute scientific codes across a distributed and heterogeneous computational resource pool (including resources from the TeraGrid and OptlPuter projects) without the need for the application developer to understand the intricacies of the grid","PeriodicalId":299480,"journal":{"name":"2006 IEEE Challenges of Large Applications in Distributed Environments","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE Challenges of Large Applications in Distributed Environments","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLADE.2006.1652055","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 25
Abstract
Electron tomography is a powerful tool for deriving three-dimensional (3D) structural information about biological systems within the spatial scale spanning 1 nm³ to 10 mm³. With this technique, it is possible to derive detailed models of sub-cellular components such as organelles and synaptic complexes and to resolve the 3D distribution of their protein constituents in situ. Due in part to exponentially growing raw data sizes, there continues to be a need for increased integration of high-performance computing (HPC) and grid technologies with traditional electron tomography processes to provide faster data-processing throughput. This is increasingly relevant because emerging mathematical algorithms that provide better data fidelity are more computationally intensive at larger raw data sizes. Progress has been made towards the transparent use of HPC and grid tools for launching scientific applications without passing the necessary administrative overhead and complexity (resource administration, authentication, scheduling, data delivery) on to the non-computer-scientist end user. There is still a need, however, to simplify the use of these tools for application developers who are developing novel algorithms for computation. Here we describe the architecture of the Telescience project (http://telescience.ucsd.edu), specifically the use of layered workflow technologies to parallelize and execute scientific codes across a distributed and heterogeneous computational resource pool (including resources from the TeraGrid and OptIPuter projects) without the need for the application developer to understand the intricacies of the grid.
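The architecture the abstract describes separates the scientist's view of a pipeline (an abstract, resource-independent workflow) from its execution on whatever resources a planner selects. A minimal sketch of that split/fan-out/merge pattern follows; it is hypothetical, using Python's standard process pool as a stand-in for a distributed resource pool, and none of the names (`split_tilt_series`, `reconstruct`, `merge`) come from the Telescience codebase.

```python
"""Minimal sketch of a workflow-parallelized tomographic reconstruction.

Assumptions: a local ProcessPoolExecutor stands in for the grid resource
pool; job names and payloads are illustrative, not the Telescience API.
"""
from concurrent.futures import ProcessPoolExecutor


def split_tilt_series(n_chunks):
    # Stage 1 (split job): partition the raw tilt series into
    # independent chunks that can be reconstructed separately.
    return [f"chunk_{i}" for i in range(n_chunks)]


def reconstruct(chunk):
    # Stage 2 (parallel fan-out): each chunk reconstructs independently;
    # this is the step a workflow engine spreads across compute nodes.
    return f"vol({chunk})"


def merge(volumes):
    # Stage 3 (merge job): gather partial volumes into the final 3D model.
    return " + ".join(volumes)


if __name__ == "__main__":
    chunks = split_tilt_series(8)
    with ProcessPoolExecutor() as pool:  # stand-in for the grid pool
        volumes = list(pool.map(reconstruct, chunks))
    print(merge(volumes))
```

In the layered approach the paper describes, a three-stage DAG like this would be handed to a workflow planner rather than a local pool, so that scheduling, authentication, and data delivery across TeraGrid and OptIPuter resources are handled below the application developer's level of concern.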