C. Duffy, Lorne Leonard, G. Bhatt, Xuan Yu, C. Lee Giles
{"title":"Watershed Reanalysis: Towards a National Strategy for Model-Data Integration","authors":"C. Duffy, Lorne Leonard, G. Bhatt, Xuan Yu, C. Lee Giles","doi":"10.1109/ESCIENCEW.2011.32","DOIUrl":"https://doi.org/10.1109/ESCIENCEW.2011.32","url":null,"abstract":"Reanalysis or retrospective analysis is the process of re-analyzing and assimilating climate and weather observations with the current modeling context. Reanalysis is an objective, quantitative method of synthesizing all sources of information (historical and real-time observations) within a unified framework. In this context, we propose a prototype for automated and virtualized web services software using national data products for climate reanalysis, soils, geology, terrain and land cover for the purpose of water resource simulation, prediction, data assimilation, calibration and archival. The prototype for model-data integration focuses on creating tools for fast data storage from selected national databases, as well as the computational resources necessary for a dynamic, distributed watershed prediction anywhere in the continental US. In the future, implementation of virtualized services will benefit from the development of a cloud cyberinfrastructure as the prototype evolves toward data- and model-intensive computation for continental-scale water resource predictions.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125764446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yassene Mohammed, Shayan Shahand, V. Korkhov, Angela C. M. Luyf, B. V. Schaik, M. Caan, A. V. Kampen, Magnus Palmblad, S. Olabarriaga
{"title":"Data Decomposition in Biomedical e-Science Applications","authors":"Yassene Mohammed, Shayan Shahand, V. Korkhov, Angela C. M. Luyf, B. V. Schaik, M. Caan, A. V. Kampen, Magnus Palmblad, S. Olabarriaga","doi":"10.1109/eScienceW.2011.7","DOIUrl":"https://doi.org/10.1109/eScienceW.2011.7","url":null,"abstract":"As the focus of e-Science is moving toward the fourth paradigm and data-intensive science, data access remains dependent on the architecture of the e-Science infrastructure in use. Such architecture is in general job-driven, i.e., a (grid) job is a sequence of commands that run on the same worker node. Making use of the infrastructure involves having a parallelized application. This is done foremost by data decomposition. In general practice of parallel programming, data decomposition depends on the programmer's experience and knowledge about the data and the algorithm/application in use. On the other hand, data mining scientists have an established foundation for data decomposition: automatic decomposition methods are already in use, and methodologies and patterns are defined. Our experience in porting biomedical applications to the Dutch e-Science infrastructure shows that the data decomposition used to gain parallelism fits, to some degree, a subgroup of the data mining decomposition patterns, i.e., object set decomposition. In this paper we discuss porting three biomedical packages to a grid computing environment, two for medical imaging and one for DNA sequencing. We show how the data access of the applications was reengineered around the executables to make use of the parallel capacity of e-Science infrastructure.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"341 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124213989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Computational Method for Evaluation of Dynamic Elecrostatic Effects of Explicit Solvent and Membrane Molecules from Molecular Dynamics Simulations","authors":"Y. Yonezawa","doi":"10.1109/eScienceW.2011.18","DOIUrl":"https://doi.org/10.1109/eScienceW.2011.18","url":null,"abstract":"Knowledge of the electronic structures of local functional sites of proteins sheds light on their fundamental mechanisms of enzymatic reaction and processes related to electronic state. Although the dynamic effects due to solvent or membrane molecules surrounding the protein are indispensable for an accurate analysis, in current methods they have been approximated by a continuum model with polarized material, where a phenomenological and unreliable parameter, the dielectric constant, is always required. We have developed a new algorithm to reproduce an average field due to the solvent and membrane molecules, which are calculated from the long trajectory of a classical molecular dynamics simulation for a membrane protein-solvent system, by several thousands of pseudo-charges and dipoles on a closed surface surrounding a target quantum mechanical (QM) region. Since the dynamic effects are represented only by \"static\" pseudo-charges and dipoles, the QM calculation needs to be done only once. We applied this algorithm to the photosynthetic reaction center of Rhodobacter sphaeroides with explicit all-atomic models of the solvent and membrane molecules. It is possible that the electronic structures of its ground state and excited state can be calculated with those microscopic \"reaction field\" effects.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128159764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessment of Resource Quality for Service Level Agreements in Life Science Grids","authors":"Tibor Kálmán","doi":"10.1109/ESCIENCEW.2011.17","DOIUrl":"https://doi.org/10.1109/ESCIENCEW.2011.17","url":null,"abstract":"This article focuses on measuring, describing, monitoring and publishing the quality and performance of grid resources. Life science communities can employ Service Level Agreements (SLAs) with their resource providers to ensure the delivery of services. For this, it is important for both the life science communities and their providers to understand and quantify the performance and service quality of different grid environments. However, measuring service quality in grid infrastructures utilizing different middlewares, as in the German Grid Initiative, is a complex problem. We describe the state of quality metrics which are currently used by the German life science communities MediGRID, Services@MediGRID and PneumoGrid. We also identify further quality metrics for defining and monitoring grid resource quality in D-Grid. It is important to publish and exchange the quality information by grid information systems, which are the entry points to grid services. Therefore, we also present how quality information can be handled by the GLUE v2.0 Schema, which is the upcoming standard data model used by grid information systems. For measuring and monitoring the quality metrics in multi-middleware environments two approaches are discussed. The first approach extracts quality information from an external benchmarking system and loads it into the grid information systems. The second solution targets life science communities that do not utilize legacy benchmarking systems, but operate traditional monitoring systems, like Nagios.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121875568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gaming for (Citizen) Science: Exploring Motivation and Data Quality in the Context of Crowdsourced Science through the Design and Evaluation of a Social-Computational System","authors":"Nathan R. Prestopnik, Kevin Crowston","doi":"10.1109/ESCIENCEW.2011.14","DOIUrl":"https://doi.org/10.1109/ESCIENCEW.2011.14","url":null,"abstract":"Citizen Sort, currently under development, is a web-based social-computational system designed to support a citizen science task, the taxonomic classification of various insect, animal, and plant species. In addition to supporting this natural science objective, the Citizen Sort platform will also support information science research goals on motivation for participation in social-computation and citizen science. In particular, this research program addresses the use of games to motivate participation in social-computational citizen science, and explores the effects of system design on motivation and data quality. A design science approach, where IT artifacts are developed to solve problems and answer research questions, is described. Research questions, progress on Citizen Sort planning and implementation, and key challenges are discussed.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132584671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Flexible Database-Centric Platform for Citizen Science Data Capture","authors":"C. Ellul, L. Francis, M. Haklay","doi":"10.1109/ESCIENCEW.2011.15","DOIUrl":"https://doi.org/10.1109/ESCIENCEW.2011.15","url":null,"abstract":"The paper describes a platform developed by the Extreme Citizen Science (ExCiteS) group at University College London over the past five years to facilitate online data capture by Citizen Scientists in the context of community science, where local environmental problems are monitored. Responding to user needs, the platform has been developed to be as flexible as possible in terms of the types of data that can be captured -- these currently include numbers, text, video, photography, pull-down lists, multiple selection lists and so forth. Live data feeds and links to social networking such as Twitter have also been incorporated. This platform is database-centric, and thus allows capture and storage of data from multiple devices (currently Web and mobile) in one central location. All map-based data is captured and held in native spatial data format inside the database. To support Citizen Science activity, the system has been designed to allow new projects to be added without the requirement for additional development (programming), and an administration tool developed to support this task. Each project is allocated custom themes depending on the project requirements and a variety of 'skins' can be configured to give the website a different appearance in each case. The platform is currently used by over 20 different groups within the United Kingdom -- though mostly for more social and perceptual data collection, rather than scientific. After demonstrating its use in an urban noise study, it is now adapted for use in air pollution studies. An extension to mobile devices (Android) is also under development.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114225097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Wiggins, Greg Newman, R. Stevenson, Kevin Crowston
{"title":"Mechanisms for Data Quality and Validation in Citizen Science","authors":"A. Wiggins, Greg Newman, R. Stevenson, Kevin Crowston","doi":"10.1109/ESCIENCEW.2011.27","DOIUrl":"https://doi.org/10.1109/ESCIENCEW.2011.27","url":null,"abstract":"Data quality is a primary concern for researchers employing a public participation in scientific research (PPSR) or \"citizen science\" approach. This mode of scientific collaboration relies on contributions from a large, often unknown population of volunteers with variable expertise. In a survey of PPSR projects, we found that most projects employ multiple mechanisms to ensure data quality and appropriate levels of validation. We created a framework of 18 mechanisms commonly employed by PPSR projects for ensuring data quality, based on direct experience of the authors and a review of the survey data, noting two categories of sources of error (protocols, participants) and three potential intervention points (before, during and after participation), which can be used to guide project design.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128757287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Kunszt, L. Malmström, Nicola Fantini, Wibke Sudholt, M. Lautenschlager, Roland Reifler, Stefan Ruckstuhl
{"title":"Accelerating 3D Protein Modeling Using Cloud Computing: Using Rosetta as a Service on the IBM SmartCloud","authors":"P. Kunszt, L. Malmström, Nicola Fantini, Wibke Sudholt, M. Lautenschlager, Roland Reifler, Stefan Ruckstuhl","doi":"10.1109/eScienceW.2011.12","DOIUrl":"https://doi.org/10.1109/eScienceW.2011.12","url":null,"abstract":"Biology as a scientific domain needs a growing amount of computational power. However, not every researcher has access to high performance computing resources locally. Today, it is easy to buy computing resources on demand from public cloud providers like Amazon and IBM, paying only for the amount of computing that is really being used. However, the difficulty of setting up the simulation and operating the virtual infrastructure often stops scientists from using cloud resources. This gap is filled by innovative software-as-a-service providers like the ETH spin-off company Cloud Broker GmbH, which enable more direct access to commercial clouds for researchers in life science. Here we report on a joint project between the ETH Zurich, IBM and Cloud Broker to perform a large-scale 3D protein model simulation using the application Rosetta on the new IBM SmartCloud Enterprise.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117324780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Eisenhauer, M. Wolf, H. Abbasi, S. Klasky, K. Schwan
{"title":"A Type System for High Performance Communication and Computation","authors":"G. Eisenhauer, M. Wolf, H. Abbasi, S. Klasky, K. Schwan","doi":"10.1109/eScienceW.2011.16","DOIUrl":"https://doi.org/10.1109/eScienceW.2011.16","url":null,"abstract":"The manner in which data is represented, accessed and transmitted has an effect upon the efficiency of any computing system. In the domain of high performance computing, traditional frameworks like MPI have relied upon a relatively static type system with a high degree of a priori knowledge shared among the participants. However, modern scientific computing is increasingly distributed and dynamic, requiring the ability to dynamically create multi-platform workflows, to move processing to data, and to perform both in situ and streaming data analysis. Traditional approaches to data type description and communication in middleware, which typically either require a priori agreement on data types, or resort to highly inefficient representations like XML, are insufficient for the new domain of dynamic science. This paper describes a different approach, using FFS, a middleware library that implements efficient manipulation of application-level data. FFS provides for highly efficient binary data communication, XML-like examination of unknown data, and both third-party and in situ data processing via dynamic code generation. All of these capabilities are fully dynamic at run-time, without requiring a priori agreements or knowledge of the exact form of the data being communicated or analyzed.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126192155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taxonomy of Multiscale Computing Communities","authors":"D. Groen, S. Zasada, P. Coveney","doi":"10.1109/eScienceW.2011.11","DOIUrl":"https://doi.org/10.1109/eScienceW.2011.11","url":null,"abstract":"We present a concise and comprehensive review of research communities which perform multiscale computing. We provide an overview of communities in a range of domains, and compare these communities to assess the level of use of multiscale methods in different research domains. Additionally, we characterize several areas where inter-disciplinary multiscale collaboration or the introduction of common and reusable methods could be particularly beneficial. We conclude that multiscale computing has become increasingly popular in recent years, that different communities adopt radically different organizational approaches, and that simulations on a length scale of a few metres and a time scale of a few hours can be found in many of the multiscale research domains. Sharing multiscale methods specifically geared towards these scales between communities may therefore be particularly beneficial.","PeriodicalId":267737,"journal":{"name":"2011 IEEE Seventh International Conference on e-Science Workshops","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116313540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}