{"title":"PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets","authors":"Tao Xiao, C. Yuan, Y. Huang","doi":"10.1109/PAAP.2011.38","DOIUrl":"https://doi.org/10.1109/PAAP.2011.38","url":null,"abstract":"Many algorithms have been proposed in past decades to efficiently mine frequent sets in transaction databases, including the SON algorithm proposed by Savasere, Omiecinski and Navathe. This paper introduces the SON algorithm, explains why SON is well suited to parallelization, and illustrates how to adapt SON to the MapReduce paradigm. We then propose a parallelized SON algorithm, PSON, and implement it in Hadoop. Our study suggests that PSON can mine frequent item sets from a very large database with good performance. The experimental results show that, when mining frequent sets, the time cost increases almost linearly with the size of the datasets and decreases approximately linearly with the number of cluster nodes. Consequently, we conclude that PSON works well for mining frequent sets from massive datasets, with good performance in both scalability and speed-up.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127820168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Architecture for Fast RSA Key Generation Based on RNS","authors":"Jingwei Hu, W. Guo, Jizeng Wei, Yisong Chang, Dazhi Sun","doi":"10.1109/PAAP.2011.75","DOIUrl":"https://doi.org/10.1109/PAAP.2011.75","url":null,"abstract":"RSA key generation is of great concern for implementing the RSA cryptosystem on embedded systems due to its long processing latency. In this paper, a novel architecture is presented to provide high processing speed for RSA key generation on embedded platforms with limited processing capacity. In order to exploit more data-level parallelism, the Residue Number System (RNS) is introduced to accelerate RSA key pair generation, in which independent elements can be processed simultaneously. A cipher processor based on Transport Triggered Architecture (TTA) is proposed to realize the parallelism at the architecture level. In the meantime, division is avoided in the proposed architecture, which reduces the expense of hardware implementation remarkably. The proposed design is implemented in Verilog HDL and synthesized in a 0.18µm CMOS process. A rate of 3 pairs per second can be achieved for 1024-bit RSA key generation at the frequency of 100 MHz.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121241178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Parallel K-Means Clustering Algorithm with MPI","authors":"Jing Zhang, Gongqing Wu, Xuegang Hu, Shiying Li, Shuilong Hao","doi":"10.1109/PAAP.2011.17","DOIUrl":"https://doi.org/10.1109/PAAP.2011.17","url":null,"abstract":"Clustering is one of the most popular methods for data analysis and is prevalent in many disciplines such as image segmentation, bioinformatics, pattern recognition and statistics. The most popular and simplest clustering algorithm is K-means because of its easy implementation, simplicity, efficiency and empirical success. However, real-world applications produce huge volumes of data, so how to handle these data efficiently in an important mining task has been a challenging and significant issue. In addition, MPI (Message Passing Interface), as a message-passing programming model, offers high performance, scalability and portability. Motivated by this, a parallel K-means clustering algorithm with MPI, called MKmeans, is proposed in this paper. The algorithm enables the clustering algorithm to be applied effectively in a parallel environment. An experimental study demonstrates that MKmeans is relatively stable and portable, and that it performs with low time overhead on large volumes of data.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"147 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115858294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Performance Evaluation of Cellular/WLAN Integrated Networks","authors":"Liying Yang, Guozhi Song, W. Jigang","doi":"10.1109/PAAP.2011.24","DOIUrl":"https://doi.org/10.1109/PAAP.2011.24","url":null,"abstract":"One of the key issues in cellular networks is the traffic load imbalance problem in the form of hot-spots caused by different user mobility levels. A sound current approach to the problem is the integration of heterogeneous networks, such as constructing a system by seamlessly connecting a cellular network and a wireless local area network (WLAN). In general, the traffic volume is significantly heavier in hot-spots of cellular networks, and a higher data transfer rate can be provided by introducing WLAN so as to raise channel utilization and achieve a good balance between user satisfaction and network efficiency. In this paper, we analyze and evaluate the overall performance of the systems both before and after the integration based upon an existing mathematical model, focusing especially on the quantitative analysis of changes in system performance. In particular, the degree of improvement can be demonstrated precisely by calculating several measures, i.e. call blocking probability and call dropping probability, and comparing them with those calculated for a cellular-only network, and hence the efficiency and superiority of the integrated system are shown.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122690654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Compilation of C Applications for FPGA-Based Hardware Acceleration","authors":"Lieu My Chuong, Y. Aung, S. Lam, T. Srikanthan, C. Lim","doi":"10.1109/PAAP.2011.70","DOIUrl":"https://doi.org/10.1109/PAAP.2011.70","url":null,"abstract":"Advancement in design tools is necessary to bridge the widening productivity gap between hardware design and software development in state-of-the-art Field Programmable Gate Arrays (FPGA). We present a design exploration framework that automatically compiles C applications to realize efficient custom co-processor structures for hardware acceleration on the reconfigurable logic. We show that the proposed design exploration framework can automatically generate Register Transfer Level (RTL) codes from C-functions that outperform the commercial Altera C2H RTL generator by about 40% in terms of average area-time product.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126566657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DHFS: A High-Throughput Heterogeneous File System Based on Mainframe for Cloud Storage","authors":"Hongtao Du, Zhanhuai Li","doi":"10.1109/PAAP.2011.13","DOIUrl":"https://doi.org/10.1109/PAAP.2011.13","url":null,"abstract":"The file system for Cloud Storage usually uses a distributed structure that stores the metadata and the user data separately. The I/O nodes get the management information of a file from a specific metadata server through the network. Under this mode, the metadata server can become the performance bottleneck of the whole system when Cloud Storage has a huge number of nodes or heavy workloads. Therefore, the capability of the metadata server becomes one of the key aspects of the whole system. In this paper, we present DHFS (Distribute Heterogeneous File System), a high-throughput system for Cloud Storage that builds the metadata server on a high-performance mainframe. Because the file system organization differs between the mainframe and open systems (Unix, Solaris, Windows, Linux, etc.), we implement a user-level library for heterogeneous data sharing which allows applications in open systems to access files on the mainframe. We also employ a series of strategies for performance optimization.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132669505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Network Resources Monitoring Based on Multi-agent and Matrix Grammar","authors":"Weidong Min","doi":"10.1109/PAAP.2011.25","DOIUrl":"https://doi.org/10.1109/PAAP.2011.25","url":null,"abstract":"Network resources monitoring and management is critical to ensure the security and load balance of networks and information systems, especially in the increasingly widely used cloud computing and distributed parallel architectures. This paper presents a distributed network resources monitoring solution based on multi-agent and matrix grammar. A distributed multi-agent architecture for network resources monitoring is described. The paper proposes a generic matrix grammar which uses WMI, CIM and SNMP to remotely collect and manage data from network components. The matrix grammar provides a generic mechanism to describe what is to be monitored and how to collect and process data. A monitoring automation engine consisting of a matrix analyzer and a recipe processor is described. The proposed solution has good extensibility and scalability, and enables monitoring automation and software reusability.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122245879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Parallel FDTD Method Performance Using SSE Instructions","authors":"Lihong Zhang, Wenhua Yu","doi":"10.1109/PAAP.2011.16","DOIUrl":"https://doi.org/10.1109/PAAP.2011.16","url":null,"abstract":"Electromagnetic researchers are often faced with long execution times, and therefore algorithmic and implementation-level optimization can dramatically increase the overall performance of electromagnetic simulation using the FDTD method. In this paper, we focus on accelerating an implementation of the 3D parallel FDTD method by taking advantage of the extended instruction sets found in modern processors, in particular the SSE instruction set. We present an SSE version of the 3D parallel FDTD method that results in a considerable 3x speedup.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126860340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inter-domain Communication Mechanism Design and Implementation for High Performance","authors":"Jianbao Ren, Yong Qi, Yue-hua Dai, Xuan Yu","doi":"10.1109/PAAP.2011.41","DOIUrl":"https://doi.org/10.1109/PAAP.2011.41","url":null,"abstract":"Running multiple operating systems on one physical machine is the major method to improve computer utilization. With the wide use of virtualization technology in cloud computing, the efficiency of inter-domain communication becomes the key factor for the performance of distributed applications, especially network-intensive ones. The communication synchronization mechanism used by traditional VMMs is based on asynchronous signals provided by the VMM and often leads to high latency and low performance. In this paper, we design and implement a communication mechanism named OSVSocket which uses inter-processor interrupts (IPI) to synchronize and eliminate some useless packet checks. We use shared memory to reduce the time for data copying. Our prototype is implemented on an x86 VMM that we developed ourselves. The experiment shows that OSVSocket has lower latency and higher performance compared with UNIX IPC.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132127229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Topological Model for Grayscale Image Transformation","authors":"Jinghong Pan, Xiaoyuan Yang","doi":"10.1109/PAAP.2011.63","DOIUrl":"https://doi.org/10.1109/PAAP.2011.63","url":null,"abstract":"In this paper, we work on topological properties of 2-D grayscale images, such as the connectivity of regions, the boundaries and the adjacency. We present some original ideas for applying topology methods to gray-level image transformation, which we call pansystem topology. We use a Pansystem Parental model to develop an algorithm for grayscale image transformation. We also analyze the procedure of using the pansystem clustering model to calculate connected components. The developed model can be applied to biological imaging applications such as X-ray image segmentation and cell counting.","PeriodicalId":213010,"journal":{"name":"2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133926391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}