{"title":"File-Level, Host-Side Flash Caching with Loris","authors":"Raja Appuswamy, D. V. Moolenbroek, Sharan Santhanam, A. Tanenbaum","doi":"10.1109/ICPADS.2013.18","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.18","url":null,"abstract":"As enterprises shift from using direct-attached storage to network-based storage for housing primary data, flash-based, host-side caching has gained momentum as the primary latency reduction technique. In this paper, we make the case for integration of flash caching algorithms at the file level, as opposed to the conventional block-level integration. In doing so, we will show how our extensions to Loris, a reliable, file-oriented storage stack, transform it into a framework for designing layout-independent, file-level caching systems. Using our Loris prototype, we demonstrate the effectiveness of Loris-based, file-level flash caching systems over their block-level counterparts, and investigate the effect of various write and allocation policies on the overall performance.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116492006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OPTAS: Optimal Data Placement in MapReduce","authors":"Changjian Wang, Yongrui Qin, Zhen Huang, Yuxing Peng, Dongsheng Li, Huiba Li","doi":"10.1109/ICPADS.2013.52","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.52","url":null,"abstract":"The data placement strategy greatly affects the efficiency of MapReduce. The current strategy only takes the map phase into account to optimize the map time. But the ignored shuffle phase may increase the total running time significantly in many jobs. We propose a new data placement strategy, named OPTAS, which optimizes both the map and shuffle phases to reduce their total time. However, the huge search space makes it difficult to find an optimal data placement instance (DPI) rapidly. To address this problem, an algorithm is proposed which can prune most of the search space and find an optimal result quickly. The search space is first segmented in ascending order according to the potential map time. Within each segment, we propose an efficient method to construct a local optimal DPI with the minimal total time of both the map and shuffle phases. To find the global optimal DPI, we scan the local optimal DPIs in order. We have proven that the global optimal DPI can be found as the first local optimal DPI whose total time stops decreasing, thus further pruning the search space. In practice, we find that at most fourteen local optimal DPIs are scanned in tens of thousands of segments with the pruning strategy. Extensive experiments with real trace data verify not only the theoretical analysis of our pruning strategy and construction method but also the optimality of OPTAS. The best improvements obtained in our experiments can be over 40% compared with the existing strategy used by MapReduce.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128089864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating the Calculation of Scattering of Complex Targets from Background Radiation with CUDA, OpenACC and OpenHMPP","authors":"Xing Guo, Zhensen Wu, Jiaji Wu","doi":"10.1109/ICPADS.2013.125","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.125","url":null,"abstract":"A Graphics Processing Unit (GPU) is used to accelerate the calculation of scattering of complex targets from background radiation in the infrared spectrum. Compute Unified Device Architecture (CUDA), OpenACC, and Hybrid Multicore Parallel Programming (OpenHMPP) implementations are presented. In all our implementations, the scattering of background radiation in different directions is calculated in parallel. A personal desktop with two NVIDIA GeForce GTX 590 GPUs and an Intel i7 CPU is used in our experiment. In CUDA, by using shared memory to buffer the background radiation and BRDF parameters and tuning the grid organization, we achieve a speedup of 197x. The OpenACC implementation is realized by inserting the parallel loop construct with a reduction clause before the loop in the original serial code. By utilizing the data clause and tuning the number of gangs used, a speedup of 158.9x is obtained. In the OpenHMPP implementation, the loop iterating over the incident direction in the original code is transformed into the codelet function, and we achieve a speedup of 160.7x. Our effort makes the calculation for complex targets in real time possible.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133573437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Particle Swarm Optimization-Based Impurity Function Band Prioritization Using Weighted Majority Voting for Feature Extraction of High Dimensional Data Sets","authors":"Yang-Lang Chang, Min-Yu Huang, Ping-Hao Wang, Tung-Ju Hsieh, Jyh-Perng Fang, Bormin Huang","doi":"10.1109/ICPADS.2013.124","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.124","url":null,"abstract":"In recent years, with the improvement of sensor technologies, the volumes of remote sensing data have increased dramatically. The feature extraction of hyperspectral remotely sensed images can reduce such high-dimensional datasets, solve the big data problem, avoid the Hughes phenomenon and improve the classification performance. Accordingly, this paper presents a framework for feature extraction of hyperspectral imagery, which consists of two approaches, referred to as parallel particle swarm optimization (PPSO) band selection and weighted voting impurity function (WVIF) band prioritization. The highly correlated bands of hyperspectral imagery can first be grouped into modules by the PPSO band selection algorithm to coarsely reduce high-dimensional datasets, and these highly correlated band modules can then be analyzed with the statistical relationship between bands and classes by the WVIF band prioritization method to finely select the most important feature bands from the datasets. Furthermore, a PPSO algorithm based on modern graphics processing unit (GPU) architecture using NVIDIA compute unified device architecture (CUDA) technology is used in this paper. It can improve the computational speed of PPSO band selection to group the highly correlated band modules. The effectiveness of the proposed PPSO/WVIF framework is evaluated with MASTER and AVIRIS hyperspectral images. The experimental results demonstrated that the proposed method could not only reduce the dimension of datasets but also offer satisfactory classification performance and computational speed.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133451707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NR-MPI: A Non-stop and Fault Resilient MPI","authors":"Guang Suo, Yutong Lu, Xiangke Liao, Min Xie, H. Cao","doi":"10.1109/ICPADS.2013.37","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.37","url":null,"abstract":"Fault resilience has become a major issue for HPC systems, in particular in the perspective of future E-scale systems, which will consist of millions of CPU cores and other components. Fault tolerant MPI was proposed to offer support for software-level fault tolerance approaches. However, the widely used MPI implementations, such as MPICH and MVAPICH2, provide limited support for fault tolerance. This paper proposes NR-MPI, a Non-stop and Fault Resilient MPI. NR-MPI implements the semantics of FT-MPI based on MPICH. Specifically, this paper focuses on failure detection in the MPI library, online failure recovery of communicators for multiple failures, and a friendly programming interface extension for NR-MPI. Furthermore, to support failure recovery of applications, NR-MPI implements data backup and restore interfaces based on double in-memory checkpoint/restart. We conduct experiments with NPB benchmarks on the TH-1A supercomputer. Experimental results show that NR-MPI based fault tolerant programs can recover from failures online without restarting, and the overhead is small even for applications with tens of thousands of cores.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125895198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application Aware DRAM Bank Partitioning in CMP","authors":"Takakazu Ikeda, Kenji Kise","doi":"10.1109/ICPADS.2013.56","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.56","url":null,"abstract":"Main memory is a shared resource among cores in a chip and the speed gap between cores and main memory limits the total system performance. Thus, main memory should be effectively accessed by each core. Exploiting both parallelism and locality of main memory is the key to realizing efficient memory access. The parallelism between memory banks can hide the latency by pipelining memory accesses. The locality of memory accesses improves the hit ratio of the row buffer in DRAM chips. The state-of-the-art method called bpart was proposed to improve memory access efficiency. In bpart, one bank is monopolized by one thread, and this monopolization improves row buffer locality by alleviating inter-thread interference. However, bpart is not effective for threads with poor locality. Moreover, the bank-level parallelism is not exploited. We propose a new bank partitioning method which exploits parallelism in addition to locality. Our method applies two types of bank usage. One usage is that low locality threads share banks to improve parallelism, and the other usage is that each high locality thread monopolizes a bank to improve row buffer locality. We evaluate our proposed method with our in-house software simulator and the SPEC CPU 2006 benchmark. On average, system throughput is increased by 1.0% and minimum speedup (fairness metrics) is increased by 7.9% relative to bpart. This result shows that our proposed method has better performance and fairness than bpart.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122002350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cloud Governance - The Relevance of Cloud Brokers","authors":"V. Stantchev, G. Tamm","doi":"10.1109/.83","DOIUrl":"https://doi.org/10.1109/.83","url":null,"abstract":"This contribution provides an overview of cloud governance aspects based on a representative survey of trends in the perception, assessment and adoption of Cloud Computing within the enterprise. The survey was conducted through a two-fold research approach and combined a meta-study of existing and published empirical studies with our own empirical study within a representative group of European and international enterprises. We consider governance through the prism of adoption drivers and focus more specifically on the role of cloud brokers and cloud marketplaces.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130222434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of Media Content Sharing Services in Home-Based IoT Networks","authors":"Chih-Lin Hu, Hung-Tsung Huang, Cheng-Lung Lin, Nguyen Huu Minh Anh, Yi-Yu Su, Pin-Chuan Liu","doi":"10.1109/ICPADS.2013.108","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.108","url":null,"abstract":"The penetration ratio of broadband networks into residential areas is increasing rapidly with the wide distribution of Internet service providers and networks. People are able to distribute and play various media content with many types of networked multimedia devices for home multimedia entertainment in residential environments. This paper addresses a new idea of home-based IoT networks where home-networked devices are able to communicate with others in a friendly, networked manner instead of traditional manual configurations and wired cabling operations. Accordingly, this paper proposes a novel intelligent media distribution system based on a home-based IoT network. The design of this system integrates UPnP, face recognition, intelligent human-machine interface, and family database technologies. With UPnP, networked devices can discover neighboring devices in a network. Face recognition is incorporated and so provides the UPnP networked devices with the capability of identifying the operating user in front of them. When a user moves in a home-based network, the intelligent human-machine interface allows the user to direct any media content to be distributed to or displayed on the UPnP-based device near the user. Furthermore, this paper presents a prototypical development, as well as a real demonstration with experimental UPnP-based network devices in home networks. Therefore, the study in this paper enables a ubiquitous media distribution service in home-based IoT network environments.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129940320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative Trust Management to Support QoS-Aware Service Selection in Service-Oriented Environments","authors":"Yukyong Kim, Kyung-Goo Doh","doi":"10.1109/ICPADS.2013.91","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.91","url":null,"abstract":"Owing to the black-box nature of services, selecting a trustworthy service that best fits users' requirements is critical in service-oriented computing. Once a set of services fulfilling users' functional requirements is found, which of these services the users invoke depends mostly on the Quality of Services (QoS), particularly security, trust, and reputation. This paper proposes a trust management model to support service discovery and selection based on QoS. We define a quantitative trust evaluation method for dynamic service discovery and selection. The proposed model makes it possible for service consumers to obtain trustworthy services. Our mechanism uses consumers' feedback to describe the trust degree of services and service providers. Service selection using the quantitative measurement rather than consumers' intuition allows selecting a highly reliable service that meets their quality requirements. Finally, we give experimental results.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120978532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UserScope: A Fine-Grained Framework for Collecting Energy-Related Smartphone User Contexts","authors":"Wonwoo Jung, Kwanghwan Kim, H. Cha","doi":"10.1109/ICPADS.2013.33","DOIUrl":"https://doi.org/10.1109/ICPADS.2013.33","url":null,"abstract":"To prolong the battery lifetime of modern mobile devices, the energy management policy should be developed in a personalized way, adequately reflecting user context or the energy behavior of the user. The first step toward this personalization is to collect the relevant information, accurately and efficiently, from the device. This paper presents a fine-grained and low-overhead framework, called UserScope, which is designed to collect energy-related user contexts in Android smartphones. We classified energy-related smartphone usage and designed an appropriate set of monitoring parameters to collect from the system. The UserScope core is then implemented as a kernel module to collect all the necessary information in an event-driven manner. This kernel-level implementation ensures monitoring accuracy and low system overhead. UserScope also provides a data-sharing mechanism with which other software components in the system can easily interface. Our experiments show that UserScope accurately extracts energy-related system information with 0.8% CPU overhead. The practicality of UserScope is also validated with real deployment and subsequent analysis of the collected data.","PeriodicalId":160979,"journal":{"name":"2013 International Conference on Parallel and Distributed Systems","volume":"2 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122598906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}