Yan Zhang, Ruisheng Zhang, Qiuqiang Chen, Xiaopan Gao, Rongjing Hu, Y. Zhang, Guangcai Liu
{"title":"A Hadoop-based Massive Molecular Data Storage Solution for Virtual Screening","authors":"Yan Zhang, Ruisheng Zhang, Qiuqiang Chen, Xiaopan Gao, Rongjing Hu, Y. Zhang, Guangcai Liu","doi":"10.1109/ChinaGrid.2012.26","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.26","url":null,"abstract":"Virtual screening involves massive computing tasks in which millions of molecules are docked against a target protein. Such data-intensive science faces the challenge of managing datasets of tens of terabytes, which calls for large-scale storage. Efficient query and transmission of these datasets are further key requirements during the virtual screening process. A massive data storage solution is therefore expected to improve the efficiency of storing and accessing large numbers of molecules and their docking results, and to facilitate the data preparation and analysis phases of virtual screening. To address these requirements, we propose a novel storage solution for virtual screening based on Hadoop. HBase is used as a distributed database to persist the properties of the molecules and the docking results, and HDFS is used to store the molecule source files. A comparison of system performance is also presented. We conclude that the proposed storage solution can be considered an alternative approach to enabling efficient storage of and access to large-scale molecule and docking-result data in virtual screening research.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127794597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhili Zhao, Lian Li, Ruisheng Zhang, A. Paschke, Jiazao Lin
{"title":"Integrating Heterogeneous Grid Middleware to Support Large-Scale Bag-of-Tasks Applications","authors":"Zhili Zhao, Lian Li, Ruisheng Zhang, A. Paschke, Jiazao Lin","doi":"10.1109/ChinaGrid.2012.29","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.29","url":null,"abstract":"Bag-of-Tasks (BoTs) applications are loosely coupled parallel applications whose concurrent entities are independent of each other. In this paper, we propose a novel architecture, which integrates distributed resources across heterogeneous Grid systems to support large-scale BoT applications. We present our integration solution and introduce how our prototype is implemented. Finally, we demonstrate our solution with a use case from the domain of drug discovery. It turns out that our proposal not only provides an adequate tool to support the development of BoT applications, but also is coherent and operationally reliable.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130522498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiang Gao, Qinghua Chen, Yurong Chen, Qingwei Sun, Yan Liu, Mingzhu Li
{"title":"A Dispatching-Rule-Based Task Scheduling Policy for MapReduce with Multi-type Jobs in Heterogeneous Environments","authors":"Xiang Gao, Qinghua Chen, Yurong Chen, Qingwei Sun, Yan Liu, Mingzhu Li","doi":"10.1109/ChinaGrid.2012.27","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.27","url":null,"abstract":"MapReduce has emerged as an important and widely used programming model for distributed and parallel computing, owing to its ease of use, generality and scalability. The model was proposed mainly for large-scale data processing, i.e. data-intensive jobs, and it is optimized for homogeneous environments in which computing nodes are identical and dedicated. Today, enterprise IT systems retain massive amounts of historical management and operational data, which require both data-intensive and computation-intensive analysis on heterogeneous computing resources. To support enterprise data analysis applications with the MapReduce model, it is important to improve MapReduce's task scheduling algorithm so that it reduces the overall completion time for multi-type jobs in heterogeneous environments. This paper formulates the scheduling problem as an optimization problem. Based on job shop scheduling theory and existing approximation algorithms, we propose a new dispatching-rule-based online scheduling policy, LPT-θ. Under LPT-θ, tasks with larger processing times that fall within a θ-space are assigned higher priorities. Numerical results show that LPT-θ achieves a performance gain of 12% to 45% compared with the original scheduling algorithm in MapReduce.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115123486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construct SaaS Applications from Multi-abstract-level: Method and System","authors":"Lei Wu, Ying Pan, Shijun Liu, Qian Li","doi":"10.1109/ChinaGrid.2012.12","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.12","url":null,"abstract":"With the emergence of SaaS, enterprises are looking for application construction approaches that let them quickly adapt to new business requirements and reduce application development and maintenance costs. To speed up application construction and guarantee configurability for a large number of tenants, a flexible design is of major importance. This paper puts forward the concept of an abstract SaaS application and proposes a fast and flexible multi-abstract-level construction approach for application designers, based on service-oriented architecture. The approach abstracts the construction process into three levels: abstract SaaS application level construction, composition service level construction and service component level construction. According to the business requirements and SLA (Service-Level Agreement) constraints of multi-tenant applications, the designer first constructs the tenant application on one level and produces a running application that satisfies the tenants' SLA constraints.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114151304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tao Gao, Yanjun Xu, Xiaoying Wang, Jinlei Jiang, Yongwei Wu
{"title":"EasyDeploy: Automatic Application Deployment in Virtual Clusters","authors":"Tao Gao, Yanjun Xu, Xiaoying Wang, Jinlei Jiang, Yongwei Wu","doi":"10.1109/ChinaGrid.2012.28","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.28","url":null,"abstract":"With the rapid development of Cloud computing, using virtual clusters for scientific and business work has become a trend. Nevertheless, setting up a virtual cluster that meets user-specific requirements, such as the applications to be used, remains a big challenge. In this paper we design and implement EasyDeploy, a system that automatically sets up virtual clusters with user-specified applications in a Cloud computing environment. EasyDeploy implements its own automatic application deployment method for virtual clusters, without the help of external tools designed for traditional clusters. It decouples application packages from virtual machine images to save storage space. To reduce application package transfer time, cache and prefetching mechanisms are provided. The experimental results show that, in our settings, an eighteen-node virtual cluster with a Hadoop environment can be created in less than 50 seconds. The cache and prefetching mechanisms do reduce the transfer time of application packages. When both are used to create a virtual cluster, the transfer time is reduced by a factor of three compared with the case without any optimization strategy.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114797875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load Balancing Routing for Wireless Sensor Network in 2D Mesh","authors":"Yipiao Chen, Yi Yang, Yubo Deng, Lian Li","doi":"10.1109/ChinaGrid.2012.16","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.16","url":null,"abstract":"In this article, we address the load balancing routing problem for wireless sensor networks (WSNs) in a two-dimensional mesh. Load balancing routing is very important for WSNs. However, existing load balancing routing algorithms mostly focus on energy balance, and most of them are based on clustering, which costs time and energy to construct the clusters. We propose a load balancing routing algorithm for 2D mesh WSNs. The 2D mesh topology is popular because of its simple structure, and no clusters or cluster heads need to be constructed. We deploy base stations at the four corners of the 2D mesh WSN and let them bear the main loads. Our routing algorithm significantly reduces the loads on the center nodes and achieves a well-balanced load in the center of the network, which greatly increases the network lifetime.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122318212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Translating Chemical Scripting Languages to Unified Job-Description Language on Chemical-Grid","authors":"Min Zhang, Ruisheng Zhang, Jiajun Xie, Shuping Li, Rongjing Hu, Jingfei Hou, Shuyi Zhang","doi":"10.1109/ChinaGrid.2012.30","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.30","url":null,"abstract":"Unified Job-Description Language on Chemical-Grid (UDLC) is a domain-specific language designed to write full-function chemical jobs on Chemical-Grid quickly by providing a common language. It frees chemists to focus on problem-solving rather than on the details of various chemical scripting languages (CSLs) and the target runtime platform and associated middleware. UDLC jobs can directly invoke grid services without dealing with interaction with grid, such as job submission, job monitoring. Thus, it is interesting and meaningful for chemists to translate scripts in other CSLs to UDLC. The system architecture of the translators is presented, translation strategies and several key issues are discussed, and two specific translators are described in detail. The translators translating the scripts, especially a large number of well-tested scripts, to UDLC can help chemists carry out their research based on grid platforms.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"381 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123503854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontology Based Data Conversion from Spreadsheet to OWL","authors":"Xiaohui Zhang, Ruihua Di, Xiaochen Feng","doi":"10.1109/ChinaGrid.2012.17","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.17","url":null,"abstract":"At present, a large amount of data is stored in spreadsheets. With the development of semantic web and ontology technology, importing spreadsheet data into the semantic web is helpful for knowledge sharing and acquisition. Existing conversion methods typically convert spreadsheet data into RDF, which cannot be integrated into an already constructed ontology base. This paper proposes an open architecture called Any2OWL for converting various traditional formats to OWL based on an ontology schema. Following Any2OWL, a method for converting spreadsheets to OWL that can be imported into an ontology base is implemented. In addition, a declarative language is designed to express the mappings between spreadsheet and ontology.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128109574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}