2012 IEEE 8th International Conference on E-Science最新文献_第6页

High-performance data management for genome sequencing centers using Globus Online: A case study 使用Globus Online的基因组测序中心的高性能数据管理:一个案例研究

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404443

Dinanath Sulakhe, R. Kettimuthu, Utpal J. Davé

{"title":"High-performance data management for genome sequencing centers using Globus Online: A case study","authors":"Dinanath Sulakhe, R. Kettimuthu, Utpal J. Davé","doi":"10.1109/eScience.2012.6404443","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404443","url":null,"abstract":"In the past few years in the biomedical field, availability of low-cost sequencing methods in the form of next-generation sequencing has revolutionized the approaches life science researchers are undertaking in order to gain a better understanding of the causative factors of diseases. With biomedical researchers getting many of their patients' DNA and RNA sequenced, sequencing centers are working with hundreds of researchers with terabytes to petabytes of data for each researcher. The unprecedented scale at which genomic sequence data is generated today by high-throughput technologies requires sophisticated and high-performance methods of data handling and management. For the most part, however, the state of the art is to use hard disks to ship the data. As data volumes reach tens or even hundreds of terabytes, such approaches become increasingly impractical. Data stored on portable media can be easily lost, and typically is not readily accessible to all members of the collaboration. In this paper, we discuss the application of Globus Online within a sequencing facility to address the data movement and management challenges that arise as a result of exponentially increasing amount of data being generated by a rapidly growing number of research groups. We also present the unique challenges in applying a Globus Online solution in sequencing center environments and how we overcome those challenges.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"127 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90263422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

From scripts towards provenance inference 从文字到出处推断

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404467

M. R. Huq, P. Apers, A. Wombacher, Y. Wada, L. V. Beek

引用次数: 2

Remote phenology: Applying machine learning to detect phenological patterns in a cerrado savanna 远程物候学:应用机器学习来检测塞拉多稀树草原的物候模式

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404438

J. Almeida, J. A. D. Santos, Bruna Alberton, R. Torres, L. Morellato

{"title":"Remote phenology: Applying machine learning to detect phenological patterns in a cerrado savanna","authors":"J. Almeida, J. A. D. Santos, Bruna Alberton, R. Torres, L. Morellato","doi":"10.1109/ESCIENCE.2012.6404438","DOIUrl":"https://doi.org/10.1109/ESCIENCE.2012.6404438","url":null,"abstract":"Plant phenology has gained importance in the context of global change research, stimulating the development of new technologies for phenological observation. Digital cameras have been successfully used as multi-channel imaging sensors, providing measures of leaf color change information (RGB channels), or leafing phenological changes in plants. We monitored leaf-changing patterns of a cerrado-savanna vegetation by taken daily digital images. We extract RGB channels from digital images and correlated with phenological changes. Our first goals were: (1) to test if the color change information is able to characterize the phenological pattern of a group of species; and (2) to test if individuals from the same functional group may be automatically identified using digital images. In this paper, we present a machine learning approach to detect phenological patterns in the digital images. Our preliminary results indicate that: (1) extreme hours (morning and afternoon) are the best for identifying plant species; and (2) different plant species present a different behavior with respect to the color change information. Based on those results, we suggest that individuals from the same functional group might be identified using digital images, and introduce a new tool to help phenology experts in the species identification and location on-the-ground.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"226 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83612923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 19

Fast confidential search for bio-medical data using Bloom filters and Homomorphic Cryptography 快速机密搜索生物医学数据使用布隆过滤器和同态密码学

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404484

H. Perl, Yassene Mohammed, Michael Brenner, Matthew Smith

{"title":"Fast confidential search for bio-medical data using Bloom filters and Homomorphic Cryptography","authors":"H. Perl, Yassene Mohammed, Michael Brenner, Matthew Smith","doi":"10.1109/eScience.2012.6404484","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404484","url":null,"abstract":"Data protection is a challenge when outsourcing medical analysis, especially if one is dealing with patient related data. While securing transfer channels is possible using encryption mechanisms, protecting the data during analyses is difficult as it usually involves processing steps on the plain data. A common use case in bioinformatics is when a scientist searches for a biological sequence of amino acids or DNA nucleotides in a library or database of sequences to identify similarities. Most such search algorithms are optimized for speed with less or no consideration for data protection. Fast algorithms are especially necessary because of the immense search space represented for instance by the genome or proteome of complex organisms. We propose a new secure exact term search algorithm based on Bloom filters. Our algorithm retains data privacy by using Obfuscated Bloom filters while maintaining the performance needed for real-life applications. The results can then be further aggregated using Homomorphic Cryptography to allow exact-match searching. The proposed system facilitates outsourcing exact term search of sensitive data to on-demand resources in a way which conforms to best practice of data protection.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"20 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73072583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 20

Overview of the TriBITS lifecycle model: A Lean/Agile software lifecycle model for research-based computational science and engineering software TriBITS生命周期模型概述:基于研究的计算科学和工程软件的精益/敏捷软件生命周期模型

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404448

R. Bartlett, M. Heroux, J. Willenbring

{"title":"Overview of the TriBITS lifecycle model: A Lean/Agile software lifecycle model for research-based computational science and engineering software","authors":"R. Bartlett, M. Heroux, J. Willenbring","doi":"10.1109/eScience.2012.6404448","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404448","url":null,"abstract":"Software lifecycles are becoming an increasingly important issue for computational science & engineering (CSE) software. The process by which a piece of CSE software begins life as a set of research requirements and then matures into a trusted high-quality capability is both commonplace and extremely challenging. Although an implicit lifecycle is obviously being used in any effort, the challenges of this process-respecting the competing needs of research vs. production-cannot be overstated. Here we describe a proposal for a well-defined software life-cycle process based on modern Lean/Agile software engineering principles. What we propose is appropriate for many CSE software projects that are initially heavily focused on research but also are expected to eventually produce usable high-quality capabilities. The model is related to TriBITS, a build, integration and testing system, which serves as a strong foundation for this lifecycle model, and aspects of this lifecycle model are ingrained in the TriBITS system. Indeed this lifecycle process, if followed, will enable large-scale sustainable integration of many complex CSE software efforts across several institutions.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"43 5 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88850335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

FRED Navigator: An interactive system for visualizing results from large-scale epidemic simulations FRED Navigator:用于可视化大规模流行病模拟结果的交互式系统

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404444

Jack Paparian, Shawn T. Brown, D. Burke, J. Grefenstette

引用次数: 8

X-ray imaging software tools for HPC clusters and the Cloud 用于高性能计算集群和云的x射线成像软件工具

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404464

D. Thompson, A. Khassapov, Y. Nesterets, T. Gureyev, John A. Taylor

{"title":"X-ray imaging software tools for HPC clusters and the Cloud","authors":"D. Thompson, A. Khassapov, Y. Nesterets, T. Gureyev, John A. Taylor","doi":"10.1109/ESCIENCE.2012.6404464","DOIUrl":"https://doi.org/10.1109/ESCIENCE.2012.6404464","url":null,"abstract":"Computed Tomography (CT) is a non-destructive imaging technique widely used across many scientific, industrial and medical fields. It is both computationally and data intensive, and therefore can benefit from infrastructure in the “supercomputing” domain for research purposes, such as Synchrotron science. Our group within CSIRO has been actively developing X-ray tomography and image processing software and systems for HPC clusters. We have also leveraged the use of GPU's (Graphical Processing Units) for several codes enabling speedups by an order of magnitude or more over CPU-only implementations. A key goal of our systems is to enable our targeted “end users”, researchers, easy access to the tools, computational resources and data via familiar interfaces and client applications such that specialized HPC expertise and support is generally not required in order to initiate and control data processing, analysis and visualzation workflows. We have strived to enable the use of HPC facilities in an interactive fashion, similar to the familiar Windows desktop environment, in contrast to the traditional batch-job oriented environment that is still the norm at most HPC installations. Several collaborations have been formed, and we currently have our systems deployed on two clusters within CSIRO, Australia. A major installation at the Australian Synchrotron (MASSIVE GPU cluster) where the system has been integrated with the Imaging and Medical Beamline (IMBL) detector to provide rapid on-demand CT-reconstruction and visualization capabilities to researchers whilst on-site and remotely. A smaller-scale installation has also been deployed on a mini-cluster at the Shanghai Synchrotron Radiation Facility (SSRF) in China. All clusters run the Windows HPC Server 2008 R2 operating system. The two large clusters running our software, MASSIVE and CSIRO Bragg are currently configured as “hybrid clusters” in which individual nodes can be dual-booted between Linux and Windows as demand requires. We have also recently explored the adaptation of our CT-reconstruction code to Cloud infrastructure, and have constructed a working “proof-of-concept” system for the Microsoft Azure Cloud. However, at this stage several challenges remain to be met in order to make it a truly viable alternative to our HPC cluster solution. Recently, CSIRO was successful in its proposal to develop eResearch tools for the Australian Government funded NeCTAR Research Cloud. As part of this project our group will be contributing CT and imaging processing components.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"7 1","pages":"1-7"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80449474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

GridFTP based real-time data movement architecture for x-ray photon correlation spectroscopy at the Advanced Photon Source 基于GridFTP的先进光子源x射线光子相关光谱实时数据移动架构

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404466

S. Narayanan, T. Madden, A. Sandy, R. Kettimuthu, M. Link

引用次数: 2

On realizing the concept study ScienceSoft of the European Middleware Initiative: Open Software for Open Science 关于实现欧洲中间件计划的概念研究ScienceSoft:开放科学的开放软件

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404450

A. D. Meglio, F. Estrella, M. Riedel

{"title":"On realizing the concept study ScienceSoft of the European Middleware Initiative: Open Software for Open Science","authors":"A. D. Meglio, F. Estrella, M. Riedel","doi":"10.1109/eScience.2012.6404450","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404450","url":null,"abstract":"In September 2011 the European Middleware Initiative (EMI) started discussing the feasibility of creating an open source community for science with other projects like EGI, StratusLab, OpenAIRE, iMarine, and IGE, SMEs like DCore, Maat, SixSq, SharedObjects, communities like WLCG and LSGC. The general idea of establishing an open source community dedicated to software for scientific applications was understood and appreciated by most people. However, the lack of a precise definition of goals and scope is a limiting factor that has also made many people sceptical of the initiative. In order to understand more precisely what such an open source initiative should do and how, EMI has started a more formal feasibility study around a concept called ScienceSoft - Open Software for Open Science. A group of people from interested parties was created in December 2011 to be the ScienceSoft Steering Committee with the short-term mandate to formalize the discussions about the initiative and produce a document with an initial high-level description of the motivations, issues and possible solutions and a general plan to make it happen. The conclusions of the initial investigation were presented at CERN in February 2012 at a ScienceSoft Workshop organized by EMI. Since then, presentations of ScienceSoft have been made in various occasions, in Amsterdam in January 2012 at the EGI Workshop on Sustainability, in Taipei in February at the ISGC 2012 conference, in Munich in March at the EGI/EMI Conference and at OGF 34 in March. This paper provides information this concept study ScienceSoft as an overview distributed to the broader scientific community to critique it.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"30 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74068591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Provenance analysis: Towards quality provenance 种源分析:走向优质种源

2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404480

Y. Cheah, Beth Plale

引用次数: 28