{"title":"An Efficient Massive Data Processing Model in the Cloud -- A Preliminary Report","authors":"Guigang Zhang, C. Li, Yong Zhang, Chunxiao Xing, Jijiang Yang","doi":"10.1109/ChinaGrid.2012.21","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.21","url":null,"abstract":"Data-intensive applications and IoT applications have developed rapidly in the cloud environment. All of these applications need to process massive data, so processing that data effectively is becoming very important in the cloud. In this paper, we design an efficient massive data processing model for the cloud. The model can process structured, semi-structured, and unstructured data resources together. We introduce the key components of the model and give some of its key algorithms.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131351298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-Aware Genetic Algorithms for Task Scheduling in Cloud Computing","authors":"Ying Chang-tian, Yu Jiong","doi":"10.1109/CHINAGRID.2012.15","DOIUrl":"https://doi.org/10.1109/CHINAGRID.2012.15","url":null,"abstract":"In cloud computing, task scheduling is of paramount importance. The problem becomes more challenging when energy consumption, the traditional makespan criterion, and users' QoS are all taken into account as objectives. This paper treats the scheduling of independent tasks in cloud computing as a bi-objective minimization problem with makespan and energy consumption as the scheduling criteria. We use Dynamic Voltage Scaling (DVS) to minimize energy consumption and propose two algorithms. The two algorithms use the unified-fitness and double-fitness methods to define the fitness function and select individuals. Both adopt a genetic algorithm to search in parallel for a reasonable scheduling scheme. The simulation results demonstrate that the two algorithms can efficiently find a good compromise between makespan and energy consumption.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114302298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Consistent Backup Mechanism for Disaster Recovery that Using Container Based Virtualization","authors":"Yida Xu, Hongliang Yu, Weimin Zheng","doi":"10.1109/ChinaGrid.2012.10","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.10","url":null,"abstract":"Today's businesses rely heavily on system backup and disaster recovery, and as virtualization technologies have developed, more and more such products are based on various virtualization platforms. Container-based virtualization, a highly efficient virtualization technology, has great potential to support building more flexible backup and disaster recovery systems. In this paper, we introduce a consistent backup mechanism for disaster recovery that uses container-based virtualization. First, we propose the concept of a consistent checkpoint, which contains both the memory and the disk image at the backup point in time. Then, to back up the disk image, we use an incremental backup method and a two-step aggressive backup process in which the production system and data backup can run at the same time. Finally, we combine our disk backup method with the memory checkpoint function of the virtualization platform to back up the whole consistent checkpoint. We implement a prototype of our system, in which a non-volatile local buffer is introduced for both speed and reliability. The experimental results show that our system's running overhead on the production system is sensible, and that the overhead can drop to only 0.5% when backup is done. With our system, the backup frequency of the production system has the potential to exceed fifteen times per minute. These results indicate that our consistent backup mechanism is suitable for disaster recovery on container-based virtualization platforms.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114835114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web Service to Deliver Filtered RSS Items to a Mobile Application","authors":"Atul Sajjanhar, Ying Zhao","doi":"10.1109/ChinaGrid.2012.8","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.8","url":null,"abstract":"In the past decade there has been massive growth of data on the internet. Many people rely on XML-based RSS feeds to receive updates from websites. In this paper, we propose a method for managing the RSS feeds from various news websites. A web service is developed to deliver filtered news items from RSS feeds to a mobile client. Each news item is indexed, and the indexes are then used for filtering news items. Indexing is done in two steps: first, classical text categorization algorithms are used to assign a category to each news item; second, geoparsing is used to assign geolocation data to each news item. An Android application is developed to access filtered news items by consuming the proposed web service. A prototype is implemented using RapidMiner 5.0 as the data mining tool and SVM as the classification algorithm. Geoparsing and geocoding web services and the Android API are used to implement location-based access to news items. Experimental results show that the proposed approach is effective and saves a significant amount of the time spent processing information overload.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128262204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sampling-Based Partitioning in MapReduce for Skewed Data","authors":"Yujie Xu, Peng Zou, W. Qu, Zhiyang Li, Keqiu Li, Xiaoli Cui","doi":"10.1109/ChinaGrid.2012.18","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.18","url":null,"abstract":"MapReduce, a popular tool for distributed and scalable processing of voluminous data, has been used in many areas. However, it is not efficient when handling skewed data, since it considers only the key and adopts a uniform hash method to distribute the workload to each reducer, ignoring the key's distribution. This can lead to load imbalance, increase the processing time, and generate \"stragglers\", with performance degradation as the final result. The current approach to this problem usually adopts asynchronous Map and Reduce to gather the distribution of keys' frequencies and make a partition scheme in advance, but it costs too much waiting time. In this paper, we address the problem of how to efficiently and effectively partition the intermediate keys to balance the load of each reducer when skewed data exists. We use a sampling MapReduce job to gather the distribution of keys' frequencies, estimate the overall distribution, and make a partition scheme in advance. Then, we apply it to the map phase of the expected MapReduce job. This design not only provides a load-balanced partition scheme, but also keeps the high performance of synchronous mode in MapReduce. We also propose two partition schemes based on the sampling results: cluster combination optimization and cluster partition combination. The experimental results show that the first partition scheme is suitable for data sets with a lighter skew, while cluster partition combination has a greater time and load-balancing advantage when the data skew is heavier.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129835855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Behavioral Compatibility Analysis for Context-independent Service Substitution","authors":"Jun Jin, Jingjing Hu, Yuanda Cao, Jingxia Wang","doi":"10.1109/ChinaGrid.2012.33","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.33","url":null,"abstract":"Service composition and substitution are major research fields in Service Oriented Computing (SOC). Behavioral compatibility is very important. This paper proposes a context-dependent behavioral substitutability analysis approach. Web service processes are modeled by Petri nets. By analyzing the partial orders of transitions for the substituted service, several algorithms are given for generating temporal constraints. The paper proves that if new services satisfy the temporal constraints, the new composite service must be sound and deadlock-free. A detailed example is shown to explain the algorithms clearly. The results and algorithms can be used to improve the existing methods of service substitution verification.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121491771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Web Service Recommendation and Consumption Approach","authors":"Yueming Zhu, Ruisheng Zhang, Rongjing Hu, Zhili Zhao, Jiazao Lin, Shuyi Zhang","doi":"10.1109/ChinaGrid.2012.31","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.31","url":null,"abstract":"This work proposes an improved Web Service (WS) recommendation and consumption architecture in which functional and nonfunctional requirements, together with social network features, are taken into consideration during service recommendation and consumption. The proposed architecture includes a database that handles modified WSDL storage and a mechanism that improves recommendation and consumption via social networking principles, which supply the added information for optimizing QoS. It can take advantage of previous users' experience as WS quality information to provide other users with more trustworthy and more appropriate WSs via an independent XML file, which contains not only the initial WSDL published by the service provider but also quality of service (QoS) information for the WS drawn from the experience of a wide range of users. A recommendation module is also presented that delivers the WS that maximizes the value of QoS characteristics among others with the same functionality or name. An experimental prototype is presented.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"243 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113986918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying Multi-cell MD Method to Shared-Memory Multi-core Systems","authors":"Myongchan Kim, Shucai Yao, Gwang-Ung Go","doi":"10.1109/ChinaGrid.2012.34","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.34","url":null,"abstract":"For decades, the main parallel systems have been message-passing ones, for which there are many highly efficient parallel methods. The multi-cell MD (Molecular Dynamics) method is one such parallel method, widely used for molecular dynamics simulation on message-passing systems. The cluster is still the main environment for parallel computing, and every node in a cluster is generally a shared-memory multi-core computer. In this paper, we propose a data structure and a method so that the multi-cell MD method can also be used on shared-memory multi-core systems. The proposed data structure and method are designed on the basis of the Torus network structure, the characteristics of the multi-cell MD method, and a work-sharing structure of OpenMP. Experimental results show that our method allows the multi-cell MD method to be used on shared-memory multi-core systems and that its parallel efficiency per time step is about 10% higher than that of the HS (Half-Shell) method.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116654293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelization Mechanisms of Neighbor-Joining for CUDA Enabled Devices","authors":"Ran Zheng, Qiongyao Zhang, Hai Jin, Zhiyuan Shao, Xiaowen Feng","doi":"10.1109/ChinaGrid.2012.32","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.32","url":null,"abstract":"Multiple Sequence Alignment (MSA) is a fundamental process in bioinformatics, in which phylogenetic tree reconstruction is an essential operation. The Neighbor-Joining algorithm is the best approach to reconstructing a phylogenetic tree because of its lower time and space costs. With the rapid increase in biological sequences, reconstructing a phylogenetic tree can take many hours or even days because of the complex computation required for multiple sequence alignment. In this paper, two CUDA-based mechanisms for parallelizing the Neighbor-Joining algorithm are proposed to achieve higher performance with lower time and space costs. Data dependency is reduced by converting the running mode, and a dynamic multiple-granularity mechanism is used to handle imbalanced guide trees with a lower rate of resource occupation and higher efficiency. Compared with the basic method, the parallelization mechanisms achieve average speedups of 18.6 for thousands of datasets as well as for datasets with distant genetic relationships.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"46 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131595824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple DAGs Scheduling Based on Lowest Transportation and Completion Time Algorithm on the Cloud","authors":"Fengbo Ren, Jiong Yu","doi":"10.1109/ChinaGrid.2012.7","DOIUrl":"https://doi.org/10.1109/ChinaGrid.2012.7","url":null,"abstract":"To address the multiple-DAG workflow scheduling problem in heterogeneous distributed environments, this paper proposes a scheduling algorithm based on minimizing data transmission time and task completion time, which handles the case in which multiple DAG workflows have the same priority, and also gives a multi-priority multi-DAG mixed scheduling algorithm. Experiments show that, compared with the E-Fairness algorithm, this algorithm ensures fairness in multiple-DAG scheduling while avoiding additional data transfer overhead, shortening the makespan of the entire workflow execution, and improving resource utilization.","PeriodicalId":371382,"journal":{"name":"2012 Seventh ChinaGrid Annual Conference","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116433613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}