{"title":"A case study of MapReduce speculation for failure recovery","authors":"Huansong Fu, Yue Zhu, Weikuan Yu","doi":"10.1145/2831244.2831245","DOIUrl":"https://doi.org/10.1145/2831244.2831245","url":null,"abstract":"MapReduce has become indispensable for big data analytics. As a representative implementation of MapReduce, Hadoop/YARN strives to provide outstanding performance in terms of job turnaround time, fault tolerance, etc. It is equipped with a speculation mechanism to cope with run-time exceptions and failures. However, we reveal that the existing speculation mechanism has some major drawbacks that hinder its efficiency during failure recovery, which we refer to as the speculation breakdown. In order to address the speculation breakdown, we introduce a failure-aware speculation scheme and a refined scheduling policy. Moreover, we have conducted a comprehensive set of experiments to evaluate the performance of both individual components and the whole framework. Our experimental results show that our new framework achieves dramatic performance improvement in handling task and node failures compared with the original YARN.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132944170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm","authors":"Hassan Eslami, Anthony Kougkas, Maria Kotsifakou, T. Kasampalis, Kun Feng, Yin Lu, W. Gropp, Xian-He Sun, Yong Chen, R. Thakur","doi":"10.1145/2831244.2831249","DOIUrl":"https://doi.org/10.1145/2831244.2831249","url":null,"abstract":"Many applications foreseen for the exascale era should process huge amounts of data. However, the IO infrastructure of current supercomputing architectures cannot be generalized to deal with this amount of data due to the need for excessive data movement from storage layers to compute nodes, leading to limited scalability. There have been extensive studies addressing this challenge. The Decoupled Execution Paradigm (DEP) is an attractive solution due to its unique features, such as available fast storage devices close to computational units and available programmable units close to the file system.\nIn this paper, we study the effectiveness of DEP for a well-known data-intensive kernel, disk-to-disk (aka out-of-core) sorting. We propose an optimized algorithm that uses almost all features of DEP, pushing the performance of sorting in HPC even further compared to other existing solutions. Advantages in our algorithm are gained by exploiting programmable units close to the parallel file system to achieve higher IO throughput, compressing data before sending it over the network or to disk, storing intermediate results of computation close to compute nodes, and fully overlapping IO with computation. We also provide an analytical model for our proposed algorithm. Our algorithm achieves 30% better performance compared to the theoretically optimal sorting algorithm running on the same testbed but not designed to exploit the DEP architecture.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116244764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in cloud environments","authors":"Francisco Rodrigo Duro, Francisco Javier García Blas, Florin Isaila, J. Carretero","doi":"10.1145/2831244.2831248","DOIUrl":"https://doi.org/10.1145/2831244.2831248","url":null,"abstract":"In the current scientific computing scenario, storage systems are one of the main bottlenecks in computing platforms. This issue affects both traditional high performance computing systems and modern systems based on cloud platforms. Accelerating the I/O subsystems can improve the overall performance of the applications. In this paper, we present Hercules as an I/O accelerator specially designed for improving I/O access in workflow engines deployed over cloud-based infrastructures. Hercules provides a dynamic and flexible in-memory storage platform based on NoSQL-based distributed memory systems. In addition, Hercules offers a user-level interface based on POSIX for facilitating its usage on existing solutions and legacy applications. We have evaluated the proposed solution in a public cloud environment, in this case Amazon EC2. The results show that Hercules provides a scalable I/O solution with remarkable performance, especially for write operations, compared with classic I/O approaches for high performance computing in cloud environments.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128093679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation and tuning of BioPig for genomic analysis","authors":"Lizhen Shi, Zhong Wang, Weikuan Yu, Xiandong Meng","doi":"10.1145/2831244.2831252","DOIUrl":"https://doi.org/10.1145/2831244.2831252","url":null,"abstract":"In this study, we aim to optimize Hadoop parameters to improve the performance of BioPig on Amazon Web Services (AWS). BioPig is a toolkit for large-scale sequencing data analysis and is built on Hadoop and Pig, which enable easy parallel programming and scaling to datasets of terabyte sizes. AWS is the most popular cloud-computing platform offered by Amazon. When running BioPig jobs on AWS, the default configuration parameters may lead to high computational costs. We select k-mer counting because it is used in a large number of next-generation sequencing (NGS) data analysis tools. We tuned Hadoop parameters from five different perspectives based on a baseline configuration. We found that tuning different Hadoop parameters led to various performance improvements. The overall job execution time of k-mer counting on BioPig was reduced by 50% using an optimized set of parameters. This paper documents our tuning experiments as a valuable reference for future Hadoop-based analytics applications on genomics datasets.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123992778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SJM: an SCM-based journaling mechanism with write reduction for file systems","authors":"Lingfang Zeng, Binbing Hou, D. Feng, K. Kent","doi":"10.1145/2831244.2831246","DOIUrl":"https://doi.org/10.1145/2831244.2831246","url":null,"abstract":"Considering the unique characteristics of storage class memory (SCM), such as non-volatility, fast access speed, byte-addressability, low-energy consumption, and in-place modification support, we investigated the features of over-write and append-write and propose a safe and write-efficient SCM-based journaling mechanism for a file system called SJM. SJM integrates the ordered and journaling modes of the traditional journaling mechanisms by storing the metadata and over-write data in the SCM-based logging device as a write-ahead log and strictly controlling the data flow. SJM writes back the valid log blocks to the file system according to their access frequency and sequentiality and thus improves the write performance. We implemented SJM on Linux 3.12 with ext2, which has no journal mechanisms. Evaluation results show that ext2 with SJM outperforms ext3 with a ramdisk-based journaling device while keeping the version consistency, especially under workloads with large write requests.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134267467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A low-cost adaptive data separation method for the flash translation layer of solid state drives","authors":"Wei Xie, Yong Chen, P. Roth","doi":"10.1145/2831244.2831250","DOIUrl":"https://doi.org/10.1145/2831244.2831250","url":null,"abstract":"Solid state drives (SSDs) have shown great potential for data-intensive computing due to their much higher throughput and lower energy consumption compared to traditional hard disk drives. Within an SSD, its Flash Translation Layer (FTL) is responsible for exposing the SSD's flash memory storage to the computer system as a simple block device. The FTL design is one of the dominant factors determining an SSD's lifespan and the amount of performance degradation. To deliver better performance, we propose a new, low-cost, adaptive separation-aware flash translation layer (ASA-FTL) that combines data clustering and selective caching of recency information to accurately identify and separate hot/cold data while incurring minimal overhead. Using simulations of ASA-FTL with real-world workloads, we have shown that our proposed approach reduces the garbage collection overhead by up to 28% and the overall response time by 15% compared to one of the most advanced existing FTLs.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131838192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Route-aware independent MPI I/O on the blue gene/Q","authors":"Preeti Malakar, V. Vishwanath","doi":"10.1145/2831244.2831251","DOIUrl":"https://doi.org/10.1145/2831244.2831251","url":null,"abstract":"Scalable high-performance I/O is crucial for application performance on large-scale systems. With the growing complexity of the system interconnects, it has become important to consider the impact of network contention on I/O performance because the I/O messages traverse several hops in the interconnect before reaching the I/O nodes or the file system. In this work, we present a route-aware and load-aware algorithm to modify existing bridge node assignment in the Blue Gene/Q (BG/Q) supercomputer. We reduce the network contention and reduce the write time by an average of 60% over the default independent I/O and by 20% over collective I/O on up to 8192 nodes on the Mira BG/Q system. Our algorithm routes 1.4x fewer messages through the bridge nodes which connect to the I/O nodes on the BG/Q.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"141 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129308777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting online analytics with user-defined estimation and early termination in a MapReduce-like framework","authors":"Yi Wang, Linchuan Chen, G. Agrawal","doi":"10.1145/2831244.2831247","DOIUrl":"https://doi.org/10.1145/2831244.2831247","url":null,"abstract":"Online analytics based on runtime approximation has been widely adopted for meeting time and/or resource constraints. Though MapReduce has been gaining popularity in both scientific and commercial sectors, there are several obstacles in implementing online analytics in a MapReduce implementation.\nIn this paper, we present a MapReduce-like framework for online analytics. Our system can process the input incrementally, provide fast estimates, and terminate the execution as soon as a user-defined termination state is reached. We have extended the MapReduce API by allowing the user to customize both the estimation method and termination condition. We also have shown both the functionality and efficiency of our system through three approximate applications. A comparison with a batch processing implementation shows a speedup of at least an order of magnitude.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127870826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big data analytics on traditional HPC infrastructure using two-level storage","authors":"Pengfei Xuan, Jeffrey Denton, Feng Luo, P. Srimani","doi":"10.1145/2831244.2831253","DOIUrl":"https://doi.org/10.1145/2831244.2831253","url":null,"abstract":"Data-intensive computing has become one of the major workloads on traditional high-performance computing (HPC) clusters. Currently, deploying data-intensive computing software frameworks on HPC clusters still faces performance and scalability issues. In this paper, we develop a new two-level storage system by integrating Tachyon, an in-memory file system, with OrangeFS, a parallel file system. We model the I/O throughputs of four storage structures: HDFS, OrangeFS, Tachyon and two-level storage. We conduct computational experiments to characterize I/O throughput behavior of two-level storage and compare its performance to that of HDFS and OrangeFS, using the TeraSort benchmark. Theoretical models and experimental tests both show that the two-level storage system can increase the aggregate I/O throughputs. This work lays a solid foundation for future work in designing and building HPC systems that can provide better support for I/O-intensive workloads while preserving existing computing resources.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129516949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing both fairness and performance using rate-aware dynamic storage cache partitioning","authors":"Yong Li, D. Feng, Zhan Shi","doi":"10.1145/2534645.2534650","DOIUrl":"https://doi.org/10.1145/2534645.2534650","url":null,"abstract":"In this paper, we investigate the problem of fair storage cache allocation among multiple competing applications with diversified access rates. Commonly used cache replacement policies like LRU and most LRU variants are inherently unfair in cache allocation for heterogeneous applications. They implicitly give more cache to applications with high access rates and less cache to applications with low access rates. However, applications with high access rates do not always gain higher performance from the additional cache blocks. In contrast, slow applications suffer poor performance with a reduced cache size. It is beneficial in terms of both performance and fairness to allocate cache blocks by their utility.\nIn this paper, we propose a partition-based cache management algorithm for a shared cache. The goal of our algorithm is to find an allocation such that all heterogeneous applications can achieve a specified fairness degree, while maximizing the overall performance. To achieve this goal, we present an adaptive partition framework, which partitions the shared cache among competing applications and dynamically adjusts the partition size based on predicted utility on both fairness and performance. We implemented our algorithm in a storage simulator and evaluated the fairness and performance with various workloads. Experimental results show that, compared with LRU, our algorithm achieves a large improvement in fairness and a slight improvement in performance.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128827695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}