2012 IEEE 8th International Conference on E-Science: Latest Papers

Lessons learned from Galaxy, a Web-based platform for high-throughput genomic analyses
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404442
Jeremy Goecks, The Galaxy Team, A. Nekrutenko, James Taylor
Abstract: High-throughput sequencing assays have given rise to the field of genomics and transformed biomedical research into a computational science. Due to the large size of genomics datasets, high-performance computing is essential for analysis. Galaxy (http://galaxyproject.org) is a popular Web-based platform that can be used for all facets of genomic analyses, including data retrieval and integration, multi-step analysis, repeated analyses via workflows, visualization, collaboration, and publication. This paper describes Galaxy and discusses four lessons learned from the development of Galaxy. First, Galaxy uses open, extensible frameworks so that it can be adapted to new technologies as they become available. Second, by leveraging Web technologies, Galaxy makes genomics tools accessible to everyone and provides a common platform for collaboration. Third, Galaxy fosters community amongst both developers and users and encourages each community to adapt and extend Galaxy to meet their needs. Finally, Galaxy software development and genomic research are closely coupled, and challenges encountered during genomic research drive Galaxy development.
Pages: 1-6
Citations: 6
IPOL: Reviewed publication and public testing of research software
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404449
Nicolas Limare, L. Oudre, Pascal Getreuer
Abstract: With the journal Image Processing On Line (IPOL), we propose to promote software to the status of regular research material and subject it to the same treatment as research papers: it must be reviewed, it must be reusable and verifiable by the research community, and it must follow style and quality guidelines. In IPOL, algorithms are published with their implementation, code is peer-reviewed, and a web-based test interface is attached to each article. This results in more software released by researchers, better software quality achieved through the review process, and a large collection of test data gathered for each article. IPOL has been active since 2010 and has already published thirty articles.
Pages: 1-8
Citations: 4
Prediction of protein solubility in E. coli
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404416
T. Samak, D. Gunter, Zhong Wang
Abstract: Gene synthesis is a key step in converting digitally predicted proteins into functional proteins. However, it is a relatively expensive and labor-intensive process. About 30-50% of the synthesized proteins are not soluble, which further reduces the efficacy of gene synthesis as a method for protein function characterization. Solubility prediction from primary protein sequences holds the promise of dramatically reducing the cost of gene synthesis. This work presents a framework that creates models of solubility from sequence information. From the primary protein sequences of the genes to be synthesized, sequence features can be used to build computational models for solubility. This way, biologists can focus their effort on synthesizing genes that are highly likely to generate soluble proteins. We have developed a framework that employs several machine learning algorithms to model protein solubility. The framework is used to predict protein solubility in the Escherichia coli expression system. The analysis is performed on over 1,600 quantified proteins. The approach successfully predicted solubility with more than 80% accuracy and enabled in-depth analysis of the most important features affecting solubility. The analysis pipeline is general and can be applied to any set of sequence features to predict any binary measure. The framework also provides the biologist with a comprehensive comparison between different learning algorithms and insightful feature analysis.
Pages: 1-8
Citations: 12
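The solubility abstract above describes deriving features from primary protein sequences and scoring predictions of a binary outcome. A minimal Python sketch of those two steps follows. The residue groupings, the three features, and the function names are assumptions made for illustration; the listing does not say which features or learning algorithms the authors used.

```python
from collections import Counter

# Hypothetical residue groupings, chosen only for this illustration.
HYDROPHOBIC = set("AVILMFWY")
CHARGED = set("DEKR")


def sequence_features(seq: str) -> dict:
    """Return simple composition features for a primary protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    return {
        "length": float(n),
        "frac_hydrophobic": sum(counts[a] for a in HYDROPHOBIC) / n,
        "frac_charged": sum(counts[a] for a in CHARGED) / n,
    }


def accuracy(predicted: list, observed: list) -> float:
    """Fraction of binary predictions (soluble or not) that match observations."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)
```

Feature dictionaries like these could be fed to any binary classifier, and `accuracy` is the measure the abstract quotes when it reports more than 80%.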
Dynamic network provisioning for data intensive applications in the cloud
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404461
P. Ruth, A. Mandal, Yufeng Xin, I. Baldine, Chris Heermann, J. Chase
Abstract: Advanced networks are an essential element of data-driven science enabled by next-generation cyberinfrastructure environments. Computational activities increasingly incorporate widely dispersed resources with linkages among software components spanning multiple sites and administrative domains. We have seen recent advances in enabling on-demand network circuits in the national and international backbones coupled with Software Defined Networking (SDN) advances like OpenFlow and programmable edge technologies like OpenStack. These advances have created an unprecedented opportunity to enable complex scientific applications to run on specially tailored, dynamic infrastructure that includes compute, storage and network resources, combining the performance advantages of purpose-built infrastructures, but without the costs of a permanent infrastructure. This work presents an experience deploying scientific workflows on the ExoGENI national test bed, which dynamically allocates computational resources with high-speed circuits from backbone providers. Dynamically allocated bandwidth-provisioned high-speed circuits increase the ability of scientific applications to access and stage large data sets from remote data repositories or to move computation to remote sites and access data stored locally. The remainder of this extended abstract is a brief description of the test bed and several scientific workflow applications that were deployed using bandwidth-provisioned high-speed circuits.
Pages: 1-2
Citations: 4
Enabling scientific data sharing and re-use
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404475
B. Minsker, T. Wietsma
Abstract: Research data sharing is one of the key challenges in the e-science era. Information technologies facilitate enhanced management and sharing of research data. It is crucial to understand the current status of research data sharing in order to facilitate enhanced data sharing in the future. In this study, a conceptual model has been developed to characterize the process of data sharing and the factors which give rise to variations in data re-use. The study goes beyond a solely technical analysis and also includes psychological, social, organizational, legal and political components. The model was developed based on the literature and 21 face-to-face interviews with research, funding, data centre and publishing experts. It was validated by both a vigorous workshop and a further 55 structured telephone interviews. The overall model identifies sub-models of process, of context, and of drivers, barriers and enablers. These provide a comprehensive description of the factors that enable or inhibit the sharing of research data. They affect whether data are shared, how they are shared, and how successfully they are shared. Implementing the enablers will help the research community overcome the barriers to data re-use to facilitate future e-science endeavors.
Pages: 1-8
Citations: 27
Collaborative information management in scientific research processes
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404478
S. Crompton, B. Matthews, Erica Y. Yang, C. Neylon, S. Coles
Abstract: Research is an incremental process that both generates and consumes diverse artifacts over its lifetime. A typical research lifecycle may involve creating experimental or observational data using multiple facilities or instruments; refining raw data into derived data to test hypotheses; and publishing and presenting the findings in various formats. Each stage of this process commonly involves support systems with independent management; this, however, hinders e-scholarship, as human mediation is required to track and access related research outputs. In this paper, we describe a collaborative research information management infrastructure based on STFC facilities. The pilot system uses the InteRCom peer-to-peer protocol to propagate typed links between digital contents spread across repositories. The resultant linked web of data offers a simple but versatile solution to the tracking of research outputs in context, as these semantically annotated links form a graph of citation and provenance which can be analyzed, traversed or aggregated according to the link resource or property of interest.
Pages: 1-7
Citations: 3
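The abstract above says that typed links between research outputs form a graph of citation and provenance that can be traversed by link property. The Python sketch below illustrates that idea only. It does not model the InteRCom protocol, and the class name, method names, and link type names are hypothetical.

```python
from collections import defaultdict, deque


class LinkGraph:
    """Directed graph whose edges carry a link type (illustrative only)."""

    def __init__(self):
        # source identifier -> list of (link_type, target identifier)
        self._out = defaultdict(list)

    def add_link(self, source, link_type, target):
        self._out[source].append((link_type, target))

    def reachable(self, start, link_types):
        """Breadth-first traversal that follows only the given link types.

        Returns the discovered nodes in visiting order, excluding `start`.
        """
        seen = {start}
        order = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for link_type, target in self._out.get(node, []):
                if link_type in link_types and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order
```

For example, following only hypothetical `cites` and `derivedFrom` links from a paper would collect the derived and raw datasets behind it while ignoring links of other types.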
Open Social based group access control framework for e-Science data infrastructure
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404488
Hui Zhang, Wenjun Wu, ZhenAn Li
Abstract: In an e-Science data infrastructure, access control is a vital component to facilitate the management of the collective data and computing resources shared by researchers from geographically distributed locations. But conventional virtual organization based access control frameworks are not suitable for self-organizing, ad-hoc and opportunistic scientific collaborations, in which scientists can easily set up group-oriented authorization rules across the administrative domains. Using the emerging OAuth2.0 protocol, this paper introduces a novel Open Social based access control framework to support ad-hoc team formation and user-controlled resource sharing. Our experiences with development of the framework in e-Science data infrastructure projects demonstrate that the proposed framework is a very promising approach to resource sharing in cross-domain e-science environments.
Pages: 1-8
Citations: 7
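The abstract above combines ad-hoc groups, user-controlled sharing, and token-based authorization. The Python sketch below shows one way such a check could be structured. It is not the framework from the paper and makes no real OAuth or OpenSocial calls; `Token`, `SharingPolicy`, and their fields are hypothetical names, with the token reduced to a user name plus a set of permitted actions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    """Stand-in for an access token: who it was issued to and what it permits."""
    user: str
    scopes: frozenset


@dataclass
class SharingPolicy:
    groups: dict = field(default_factory=dict)  # group name -> set of user names
    grants: dict = field(default_factory=dict)  # resource -> {group name: set of actions}

    def create_group(self, name, members):
        """Form an ad-hoc team without any central administrator."""
        self.groups[name] = set(members)

    def share(self, resource, group, actions):
        """Let a resource owner grant actions on a resource to a whole group."""
        self.grants.setdefault(resource, {}).setdefault(group, set()).update(actions)

    def is_allowed(self, token, resource, action):
        """Allow only if the token covers the action and a group grant applies."""
        if action not in token.scopes:
            return False
        return any(
            action in actions and token.user in self.groups.get(group, set())
            for group, actions in self.grants.get(resource, {}).items()
        )
```

The two-part check keeps the token's scope and the group grant independent, so narrowing either one is enough to deny a request.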
Enabling large genomic data transfers using nation-wide and international dynamic lightpaths
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404458
J. Bot, M. D. Vos, S. Boele, M. Reinders, J. Kok
Abstract: The recent advances made in high-throughput genomic sequencing allow researchers to accurately determine the genetic make-up of an individual. Sharing this data across research institutes has proven to be challenging, as the amount of data and the available bandwidth cause large delays. Here, we present a network of dynamic lightpaths dedicated to the life sciences which connects research groups within the Netherlands to each other, to compute and storage providers, and to commercial partners.
Pages: 1-2
Citations: 6
Calibration of watershed models using cloud computing
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404420
M. Humphrey, N. Beekwilder, J. Goodall, M. Ercan
Abstract: Understanding hydrologic systems at the scale of large watersheds and river basins is critically important to society when faced with extreme events, such as floods and droughts, or with concerns about water quality. A critical requirement of watershed modeling is model calibration, in which the computational model's parameters are varied during a search algorithm in order to find the best match against physically observed phenomena such as streamflow. Because it is generally performed on a laptop computer, this calibration phase can be very time-consuming, significantly limiting the ability of a hydrologist to experiment with different models. In this paper, we describe our system for watershed model calibration using cloud computing, specifically Microsoft Windows Azure. With a representative watershed model whose calibration takes 11.4 hours on a commodity laptop, our cloud-based system calibrates the watershed model in 43.32 minutes using 16 cloud cores (15.78x speedup), 11.76 minutes using 64 cloud cores (58.13x speedup), and 5.03 minutes using 256 cloud cores (135.89x speedup). We believe that such speed-ups offer the potential for real-time interactive model creation with continuous calibration, ushering in a new paradigm for watershed modeling.
Pages: 1-8
Citations: 35
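The speedups quoted in the calibration abstract above can be read as parallel efficiency (speedup divided by core count), and each timing implies a laptop baseline (cloud minutes times speedup). The short Python sketch below does that arithmetic using only the figures in the abstract; the function names are illustrative.

```python
# (cloud cores, reported minutes, reported speedup) as quoted in the abstract
REPORTED = [(16, 43.32, 15.78), (64, 11.76, 58.13), (256, 5.03, 135.89)]


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup per core; 1.0 would be perfect linear scaling."""
    return speedup / cores


def implied_baseline_minutes(minutes: float, speedup: float) -> float:
    """Single-machine runtime implied by a parallel runtime and its speedup."""
    return minutes * speedup


if __name__ == "__main__":
    for cores, minutes, speedup in REPORTED:
        eff = parallel_efficiency(speedup, cores)
        base_h = implied_baseline_minutes(minutes, speedup) / 60
        print(f"{cores:>3} cores: efficiency {eff:.1%}, implied baseline {base_h:.2f} h")
```

Efficiency comes out at roughly 99%, 91%, and 53% for 16, 64, and 256 cores, and all three rows imply a baseline of about 11.39 hours, consistent with the 11.4 hours stated. The drop at 256 cores shows the calibration scaling sub-linearly at that size.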
An integrated science portal for collaborative compute and data intensive protein structure studies
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/ESCIENCE.2012.6404425
I. Stokes-Rees, D. O'Donovan, Peter Doherty, Meghan Porter-Mahoney, P. Śliż
Abstract: The SBGrid Science Portal provides multi-modal access to computational infrastructure, data storage, and data analysis tools for the structural biology community. It incorporates features not previously seen in cyberinfrastructure science gateways. It enables researchers to securely share a computational study area, including large volumes of data and active computational workflows. A rich identity management system has been developed that simplifies federated access to US national cyberinfrastructure, distributed data storage, and high performance file transfer tools. It integrates components from the Virtual Data Toolkit, Condor, glideinWMS, the Globus Toolkit and Globus Online, the FreeIPA identity management system, Apache web server, and the Django web framework.
Pages: 1-8
Citations: 2