{"title":"A Data Management System for Pre-docking in Large-Scale Virtual Screening","authors":"Jiuqiang Chen, Ruisheng Zhang, Shilin Chen, Lian Li, Y. Zhang, Chengda Yuan, Lifen Li","doi":"10.1109/ChinaGrid.2010.40","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.40","url":null,"abstract":"Virtual screening is a new approach attracting increasing levels of interest in the pharmaceutical industry, as a productive and cost-effective technology in the search for novel lead compounds. The preparation of millions of small molecular compounds is the prerequisite for large-scale virtual screening, and these massive data are usually provided with different format. In addition, scientists often need to select some of them that meet certain conditions. Therefore, an efficient data management approach is playing an important role in virtual screening process for managing large-scale small molecular compounds. In this paper, we represent a comprehensive data management framework for pre-docking in large-scale virtual screening. In this framework, we construct a distributed chemical database and utilize parallel processing approach to search certain molecules from the database on the scale of at least several million. We also develop a proxy schema, which is responsible to perform the basic function (such as, splitting large-scale data, update, insert and so on) a collection of multiple, logically interrelated databases distributed over a computer network, meanwhile, we design and establish a rule of splitting large-scale data with optimization. Finally, we simulate and demonstrate a stress test of constructing and searching database. It turns out that our proposal could make the preparing phase of virtual screening process more simple and efficient.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124267763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integration of Task Scheduling with Replica Placement in Data Grid for Limited Disk Space of Resources","authors":"Kan Yi, Feng Ding, Heng Wang","doi":"10.1109/ChinaGrid.2010.29","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.29","url":null,"abstract":"Data grid integrates geographically distributed resources for solving data-sensitive scientific applications. As tasks are sensitive to data, dealing with large amount of data makes the requirement for efficiency in data access more critical. The goal of replica placement is to shorten data access time for enhancing the task execution performance. Therefore, replica placement strategies are often integral to task scheduling algorithms. However, all existing integration strategies make an assumption that the disk space of resources in data grid is unlimited. In this paper, we extended MinMin heuristic to cater to the situation where the disk space of a computational resource is limited. In addition, a heuristic replica placement algorithm is proposed, in which the limited disk space of a storage resource is considered as well. Another character of this heuristic replica placement algorithm is that it can map more than one hot file to several storage resources. We study our approach and evaluate it through simulation. The result shows that the integration of the two algorithms has improved the performance of data grid especially when the whole disk space of storage resources is relatively smaller than the amount of all data files.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"17 22","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113979450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A QoS-Based Scheduling Approach for Complex Workflow Applications","authors":"Fan Ding, Ruisheng Zhang, Keyin Ruan, Jiazao Lin, Zhili Zhao","doi":"10.1109/ChinaGrid.2010.42","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.42","url":null,"abstract":"In dynamic and heterogeneous Grid environments, the majority of the scientific applications require to be expressed as complex workflows which assemble multiple Web Services to implement complex scientific tasks. In addition, most workflow management systems are bound to concrete services distributed in different physical domains or concrete environments. Therefore, scientists still need to discover resources manually and schedule the jobs directly onto the Grid, and it could not meet users’ requirement of scheduling workflow with large scale services. In this paper, we provide a comprehensive QoS (Quality of Service) model to support the possibility that a service instance is capable of offering to satisfy the user’s requirements. Further we propose a complex workflow scheduling approach using dynamic programming, which focus on how to select a global optimal path for workflow scheduling based on QoS, and addressing services binding problem. At last, we demonstrate the efficiency and accuracy of the proposed approach by simulating experiments in heterogeneous and dynamic grid environment.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130194479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MISM: Modelica-Based Interactive Scheduling Mechanism for Virtual Educational Experiments","authors":"Wenbin Jiang, Zhengfei Tang, Hai Jin, Chao Liu","doi":"10.1109/ChinaGrid.2010.55","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.55","url":null,"abstract":"Different virtual educational experiments (VEEs) have different frameworks and implementations in different disciplines. How to integrate different VEEs into one unified supporting platform is a challenging problem. The key problem of this integration is how to build a unified mathematical model for VEEs of diversified disciplines. However, most existing multi-domain uniform modeling (MDUM) approaches such as Modelica are research-oriented and lack of real time interaction, which does not meet the requirement of virtual experiments that are user-oriented and highly interactive. To tackle above problem, a Modelica-based interactive scheduling mechanism (MISM) for VEE is proposed. In MISM, user-oriented interaction is based on a best step length algorithm and an interactive event step algorithm. The throughput is considered in the former and the real time interaction is resolved by the latter. Experimental results shows that the approaches presented can meet the requirements of VEEs efficiently.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"27 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133383682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A NETCONF-Based Distributed Network Management Design Using Web Services and P2P","authors":"Yanan Chang, Limiao Chen, Baozhen Wu","doi":"10.1109/ChinaGrid.2010.48","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.48","url":null,"abstract":"The Network Configuration Protocol (NETCONF) is proposed as a new solution for efficient configuration management of heterogeneous network devices. So the NETCONF-based distributed network management becomes a hot topic. In this paper, Web services and Peer-to-Peer (P2P) technologies are introduced in distributed network management to solve the problems of data collection, data sharing and locating resources of NETCONF-based network management system. This paper aims to design a NETCONF-based distributed network management architecture using Web services and P2P technology, which including the communication between domain managers or between domain manager and agents.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133640582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Range Query Performance on Historic Web Page Data","authors":"Geng Li, Bo Peng","doi":"10.1109/ChinaGrid.2010.28","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.28","url":null,"abstract":"This paper is about the performance of range queries on historic web page data set, i.e. requests into a data set of web pages that keeps record of historic versions of HTML data of URLs on the web for a subset of data, the URLs and the timestamps of which satisfy the query conditions. To keep track of all versions of every web URL, the data set could easily scale up to terabytes. Hence, systems providing query services to such a data set would require much computing resource. We show that in this scenario data storage layout has significant impact on query performance and propose storage design principles for performance improvement through quantitative approaches.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115779863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Law Text Clustering Based on Referential Relations","authors":"Biao Fan, Tao Liu, H. Hu, Xiaoyong Du","doi":"10.1109/ChinaGrid.2010.22","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.22","url":null,"abstract":"This paper proposes a new method to cluster law texts based on referential relation of laws. We extract law entities (an entity represents a law) and their referential relation from law texts. Then SimRank algorithm is applied to calculate law entity’s similarity through referential relation and law clustering is carried out based on the SimRank similarity. This is the first time to apply SimRank algorithm in the domain of Law and use it to carry out text clustering. Prototype and experiments show that our solution is feasible. We also publish the extracted data as Linked Law Data with RDF data model, which forms the first open semantic web database in Law domain. Linked Law Data enables user to access law data with rich data links and query web data by application interface of Semantic Web.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123189856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Framework of Bug Reporting System Based on Keywords Extraction and Auction Algorithm","authors":"Shenglong Tan, Shenghong Hu, Lihong Chen","doi":"10.1109/ChinaGrid.2010.13","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.13","url":null,"abstract":"Bug report is becoming a main approach to improve and perfect a complicated piece of software, but analyzing the bug report is a tedious and time-consuming job, especially when the number of bug reports is huge and many duplicate reports are mixed within the incoming bug reports. In this paper, we present a framework to automatically triage and detect the duplicate bug reports by keywords extraction and combine those existing relative reports to form a more integrated and complete bug report, and then assign the report to appropriate developer based on auction rules and developer’s experiences. After having fixed the bug, the developer can submit some new keywords related to the fixed bug to the system, in which we can complement the keywords repository.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123034095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a Performance Model of Lustre File System","authors":"Tiezhu Zhao, Verdi March, Shoubin Dong, S. See","doi":"10.1109/ChinaGrid.2010.38","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.38","url":null,"abstract":"As a large-scale global parallel file system, Lustre file system plays a key role in High Performance Computing (HPC) system, and the potential performance of such systems can be difficult to predict because the potential impact to application performance is not clearly understood. It is important to gain insights into the deliverable Lustre file system IO efficiency. In order to gain a good understanding on what and how to impact the performance of Lustre file system. This paper presents a study on performance evaluation of Lustre file systems and we propose a novel relative performance model to predict overhead under different performance factors. In our previous experiments, we discover that different performance factors have a closed correlation. In order to mining the correlations, we introduce relative performance model to predict performance differences between a pair of Lustre file system equipped with different performance factors. On average, relative model can predict bandwidth within 17%-28%. The results show our relative prediction model can obtain better prediction accuracy.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126039539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Security Challenges on the Clone, Snapshot, Migration and Rollback of Xen Based Computing Environments","authors":"Lei Yu, Chuliang Weng, Minglu Li, Yuan Luo","doi":"10.1109/ChinaGrid.2010.47","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2010.47","url":null,"abstract":"While virtual machines provide significant flexibility for users and administrators to clone, snapshot, migration and rollback with unprecedented ease, it also bring forth some new problems and negative effects to the security of computing environments. The applications and operating systems are forced to run in a dynamical and unregulated computing environment, which gives rise to so radical difference that the administrator is difficult to maintain the security of computing environment. This paper summarizes and presents some types of security challenges based on existing viewpoints, then we analysis the similar challenges in Xen and discuss the potential directions and implementations for modifying it to adapt to these challenges.","PeriodicalId":429657,"journal":{"name":"2010 Fifth Annual ChinaGrid Conference","volume":"184 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120883437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}