2012 IEEE 8th International Conference on E-Science: Latest Papers

Digitization and search: A non-traditional use of HPC
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404445
Liana Diesendruck, Luigi Marini, R. Kooper, M. Kejriwal, Kenton McHenry
{"title":"Digitization and search: A non-traditional use of HPC","authors":"Liana Diesendruck, Luigi Marini, R. Kooper, M. Kejriwal, Kenton McHenry","doi":"10.1109/eScience.2012.6404445","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404445","url":null,"abstract":"Automated search of handwritten content is a highly interesting and applicative subject, especially important today due to the public availability of large digitized document collections. We describe our efforts with the National Archives (NARA) to provide searchable access to the 1940 Census data and discuss the HPC resources needed to implement the suggested framework. Instead of trying to recognize the handwritten text, a still very difficult task, we use a content-based image retrieval technique known as Word Spotting. Through this paradigm, the system is queried by the use of handwritten text images instead of ASCII text and ranked groups of similar looking images are presented to the user. A significant amount of computing power is needed to accomplish the pre-processing of the data so as to make this search capability available on an archive. The required preprocessing steps and the open source framework developed are discussed focusing specifically on HPC considerations that are relevant when preparing to provide searchable access to sizeable collections, such as the US Census. Having processed the state of North Carolina from the 1930 Census using 98,000 SUs we estimate the processing of the entire country for 1940 could require up to 2.5 million SUs. The proposed framework can be used to provide an alternative to costly manual transcriptions for a variety of digitized paper archives.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"113 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80601556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
eResearch environment for remote instrumentation: VBL, RLI, VisLabl & 2
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404465
C. Myers, Michael D'Silva
{"title":"eResearch environment for remote instrumentation: VBL, RLI, VisLabl & 2","authors":"C. Myers, Michael D'Silva","doi":"10.1109/eScience.2012.6404465","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404465","url":null,"abstract":"This talk demonstrates the current remote experimentation capabilities deployed at the Australian Synchrotron and La Trobe University, as well as remote data transfer services deployed at the above locations and at Bragg, ANSTO, a metadata extraction tool, MyTardis nodes, remote analysis and visualisation environments for medical imaging and IR spectroscopy, and the use of high-resolution multi-screen displays.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"71 1","pages":"1-2"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90424526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Partial replica selection for spatial datasets
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404473
Yun Tian, P. J. Rhodes
{"title":"Partial replica selection for spatial datasets","authors":"Yun Tian, P. J. Rhodes","doi":"10.1109/eScience.2012.6404473","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404473","url":null,"abstract":"The implementation of partial or incomplete replicas, which represent only a subset of a larger dataset, has been an active topic of research. Partial Spatial Replicas extend this functionality to spatial data, allowing us to distribute a spatial dataset in pieces over several locations. Accessing only a subset of a spatial replica usually results in a large number of relatively small read requests made to the underlying storage device. For this reason, an accurate model of disk access is important when working with spatial subsets. We make two primary contributions in this paper. First, we describe a model for disk access performance that takes filesystem prefetching into account and is sufficiently accurate for spatial replica selection. Second, making a few simplifying assumptions, we propose a fast replica selection algorithm for partial spatial replicas. The algorithm uses a greedy approach that attempts to maximize performance by choosing a collection of replica subsets that allow fast data retrieval by a client machine. Experiments show that the performance of the solution found by our algorithm is on average always at least 91% and 93.4% of the performance of the optimal solution in 4-node and 8-node tests respectively.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"59 1 1","pages":"1-10"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89349493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
A system for management of Computational Fluid Dynamics simulations for civil engineering
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-08 DOI: 10.1109/eScience.2012.6404433
Peter Sempolinski, D. Thain, Daniel Wei, A. Kareem
{"title":"A system for management of Computational Fluid Dynamics simulations for civil engineering","authors":"Peter Sempolinski, D. Thain, Daniel Wei, A. Kareem","doi":"10.1109/eScience.2012.6404433","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404433","url":null,"abstract":"We introduce a web-based system for management of Computational Fluid Dynamics (CFD) simulations. This system provides a web-browser interface that gives users an intuitive, user-friendly means of dispatching and controlling long-running simulations. CFD presents a challenge to its users due to the complexity of its internal mathematics, the high computational demands of its simulations and the complexity of inputs to its simulations and related tasks. We designed this system to be as extensible as possible in order to be suitable for many different civil engineering applications. The front-end of this system is a webserver, which provides the user interface. The back-end is responsible for starting and stopping jobs as requested. There are also numerous components specifically for facilitating CFD computation. We discuss our experience with presenting this system to real users and the future ambitions for this project.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"29 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89687975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Temporal representation for scientific data provenance
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404477
Peng Chen, Beth Plale, M. Aktaş
{"title":"Temporal representation for scientific data provenance","authors":"Peng Chen, Beth Plale, M. Aktaş","doi":"10.1109/eScience.2012.6404477","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404477","url":null,"abstract":"Provenance of digital scientific data is an important piece of the metadata of a data object. It can however grow voluminous quickly because the granularity level of capture can be high. It can also be quite feature rich. We propose a representation of the provenance data based on logical time that reduces the feature space. Creating time and frequency domain representations of the provenance, we apply clustering, classification and association rule mining to the abstract representations to determine the usefulness of the temporal representation. We evaluate the temporal representation using an existing 10 GB database of provenance captured from a range of scientific workflows.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"13 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87441105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
A data-driven urban research environment for Australia
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404481
R. Sinnott, Christopher Bayliss, G. Galang, Phillip Greenwood, George Koetsier, D. Mannix, L. Morandini, Marcos Nino-Ruiz, C. Pettit, Martin Tomko, M. Sarwar, R. Stimson, W. Voorsluys, I. Widjaja
{"title":"A data-driven urban research environment for Australia","authors":"R. Sinnott, Christopher Bayliss, G. Galang, Phillip Greenwood, George Koetsier, D. Mannix, L. Morandini, Marcos Nino-Ruiz, C. Pettit, Martin Tomko, M. Sarwar, R. Stimson, W. Voorsluys, I. Widjaja","doi":"10.1109/eScience.2012.6404481","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404481","url":null,"abstract":"The Australian Urban Research Infrastructure Network (AURIN) project (www.aurin.org.au) is tasked with developing an e-Infrastructure to support urban and built environment research across Australia. As identified in [1], this e-Infrastructure must provide seamless access to highly distributed and heterogeneous data sets from multiple organisations with accompanying analytical and visualization capabilities. The project is tasked with delivering a secure, web-based unifying environment offering a one-stop-shop for Australia-wide urban and built environment research. This paper describes the architectural design and implementation of the AURIN data-driven e-Infrastructure, where data is not just a passive entity that is accessed and used as a consequence of research demand, but is instead, directly shaping the computational access, processing and intelligent utilization possibilities. This is demonstrated in a situational context.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"13 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82101246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
High-performance computing without commitment: SC2IT: A cloud computing interface that makes computational science available to non-specialists
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404441
K. Jorissen, W. Johnson, F. Vila, J. Rehr
{"title":"High-performance computing without commitment: SC2IT: A cloud computing interface that makes computational science available to non-specialists","authors":"K. Jorissen, W. Johnson, F. Vila, J. Rehr","doi":"10.1109/eScience.2012.6404441","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404441","url":null,"abstract":"Computational work is a vital part of many scientific studies. In materials science research in particular, theoretical models are often needed to understand measurements. There is currently a double barrier that keeps a broad class of researchers from using state-of-the-art materials science codes: the software typically lacks user-friendliness, and the hardware requirements can demand a significant investment, e.g. the purchase of a Beowulf cluster. Scientific Cloud Computing has the potential to remove this barrier and make computational science accessible to a wider class of scientists who are not computational specialists. We present a set of interface tools, SC2IT, that enables seamless control of virtual compute clusters in the Amazon EC2 cloud and is designed to be embedded in user-friendly Java GUIs. We present applications of our Scientific Cloud Computing method to the materials science codes FEFF9, WIEN2k, and MEEP-mpi. SC2IT and the paradigm described here are applicable to other fields of research outside materials science within current Cloud Computing capability.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"22 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80195839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Discovering drug targets for neglected diseases using a pharmacophylogenomic cloud workflow
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404431
Kary A. C. S. Ocaña, Daniel de Oliveira, Jonas Dias, Eduardo S. Ogasawara, M. Mattoso
{"title":"Discovering drug targets for neglected diseases using a pharmacophylogenomic cloud workflow","authors":"Kary A. C. S. Ocaña, Daniel de Oliveira, Jonas Dias, Eduardo S. Ogasawara, M. Mattoso","doi":"10.1109/eScience.2012.6404431","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404431","url":null,"abstract":"Illnesses caused by parasitic protozoa are a research priority. A representative group of these illnesses is commonly known as Neglected Tropical Diseases (NTDs). NTDs especially affect populations of low socioeconomic status around the world; new anti-protozoan inhibitors are needed, and several drug discovery projects focus on researching new drug targets. Pharmacophylogenomics is a novel bioinformatics field that aims at reducing the time and the financial cost of the drug discovery process. Pharmacophylogenomic analyses are applied mainly in the early stages of the research phase in drug discovery. Pharmacophylogenomic analysis executes several bioinformatics programs in a coherent flow to identify homologous sequences, construct phylogenetic trees and execute evolutionary and structural experiments. It can therefore be modeled as a scientific workflow. Pharmacophylogenomic analysis workflows are complex, compute- and data-intensive, and may run for weeks, so they benefit from parallel execution. We propose SciPPGx, a scientific workflow that aims at providing thorough inference support for pharmacophylogenomic hypotheses. SciPPGx is executed in parallel in a cloud using the SciCumulus workflow engine. Experiments show that SciPPGx reduces the total execution time by up to 97.1% when compared to a sequential execution. We also present representative biological results taking advantage of the inference covering several related bioinformatics overviews.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"3 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74944990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
BIGS: A framework for large-scale image processing and analysis over distributed and heterogeneous computing resources
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404424
R. Ramos-Pollán, F. González, Juan C. Caicedo, Angel Cruz-Roa, Jorge E. Camargo, Jorge A. Vanegas, Santiago A. Pérez-Rubiano, J. Bermeo, Juan Sebastian Otálora Montenegro, Paola K. Rozo, John Arevalo
{"title":"BIGS: A framework for large-scale image processing and analysis over distributed and heterogeneous computing resources","authors":"R. Ramos-Pollán, F. González, Juan C. Caicedo, Angel Cruz-Roa, Jorge E. Camargo, Jorge A. Vanegas, Santiago A. Pérez-Rubiano, J. Bermeo, Juan Sebastian Otálora Montenegro, Paola K. Rozo, John Arevalo","doi":"10.1109/eScience.2012.6404424","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404424","url":null,"abstract":"This paper presents BIGS, the Big Image Data Analysis Toolkit, a software framework for large scale image processing and analysis over heterogeneous computing resources, such as those available in clouds, grids, computer clusters or throughout scattered computer resources (desktops, labs) in an opportunistic manner. Through BIGS, eScience for image processing and analysis is conceived to exploit coarse grained parallelism based on data partitioning and parameter sweeps, avoiding the need for inter-process communication and, therefore, enabling loosely coupled computing nodes (BIGS workers). It adopts an uncommitted resource allocation model where (1) experimenters define their image processing pipelines in a simple configuration file, (2) a schedule of jobs is generated and (3) workers, as they become available, take over pending jobs as long as their dependency on other jobs is fulfilled. BIGS workers act autonomously, querying the job schedule to determine which one to take over. This removes the need for a central scheduling node, requiring only access by all workers to a shared information source. Furthermore, BIGS workers are encapsulated within different technologies to enable their agile deployment over the available computing resources. Currently they can be launched through the Amazon EC2 service over their cloud resources, through Java Web Start from any desktop computer and through regular scripting or SSH commands. This suits different kinds of research environments well, both when accessing dedicated computing clusters or clouds with committed computing capacity and when using opportunistic computing resources whose access is seldom or cannot be provisioned in advance. We also adopt a NoSQL storage model to ensure the scalability of the shared information sources required by all workers, including within BIGS support for HBase and Amazon's DynamoDB service. Overall, BIGS now enables researchers to run large scale image processing pipelines in an easy, affordable and unplanned manner with the capability to take over computing resources as they become available at run time. This is shown in this paper by using BIGS in different experimental setups in the Amazon cloud and in an opportunistic manner, demonstrating its configurability, adaptability and scalability capabilities.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"69 1","pages":"1-8"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75651632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
IRMIS: The care and feeding of a generalized relatively relational database for accelerator components with a connection to the real time EPICS Input output controllers
2012 IEEE 8th International Conference on E-Science Pub Date : 2012-10-01 DOI: 10.1109/eScience.2012.6404469
R. Farnsworth, S. Benes
{"title":"IRMIS: The care and feeding of a generalized relatively relational database for accelerator components with a connection to the real time EPICS Input output controllers","authors":"R. Farnsworth, S. Benes","doi":"10.1109/eScience.2012.6404469","DOIUrl":"https://doi.org/10.1109/eScience.2012.6404469","url":null,"abstract":"IRMIS: The care and feeding of a generalized relatively relational database for accelerator components with a connection to the real time EPICS Input output controllers. This paper describes a relational database approach to documenting and maintaining; the feeding. It describes the automated process used to generate accelerator or synchrotron component data for the relational tables and the role of devices and components. The data thus obtained may in turn be used or presented in a variety of ways to the end user in order either to optimize the maintenance or to provide machine metadata for experimental performance purposes.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":"32 1","pages":"1-3"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83187892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0