International Symposium on Design and Implementation of Symbolic Computation Systems: Latest Publications

A case study of MapReduce speculation for failure recovery
Huansong Fu, Yue Zhu, Weikuan Yu
{"title":"A case study of MapReduce speculation for failure recovery","authors":"Huansong Fu, Yue Zhu, Weikuan Yu","doi":"10.1145/2831244.2831245","DOIUrl":"https://doi.org/10.1145/2831244.2831245","url":null,"abstract":"MapReduce has become indispensable for big data analytics. As a representative implementation of MapReduce, Hadoop/YARN strives to provide outstanding performance in terms of job turnaround time, fault tolerance etc. It is equipped with a speculation mechanism to cope with run-time exceptions and failures. However, we reveal that the existing speculation mechanism has some major drawbacks that hinder its efficiency during failure recovery, which we refer to as the speculation breakdown. In order to address the speculation breakdown, we introduce a failure-aware speculation scheme and a refined scheduling policy. Moreover, we have conducted a comprehensive set of experiments to evaluate the performance of both single component and the whole framework. Our experimental results show that our new framework achieves dramatic performance improvement in handling with task and node failures compared with the original YARN.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132944170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
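The abstract above does not spell out the failure-aware speculation scheme, so the following is only an illustrative sketch of the general idea: launch a speculative copy immediately for tasks whose node has failed, rather than relying solely on the usual slow-progress heuristic. The names (Task, pick_speculation_candidates, slow_factor) are invented; this is not the authors' YARN code.

```python
# Illustrative sketch of failure-aware speculation (not the authors' code).
# A speculative copy is launched immediately for tasks on failed nodes,
# instead of waiting for the usual progress-based straggler threshold.

from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    node: str
    progress: float                # 0.0 .. 1.0
    has_speculative_copy: bool = False

def pick_speculation_candidates(tasks, failed_nodes, avg_progress, slow_factor=0.5):
    """Tasks that should get a speculative copy right away."""
    candidates = []
    for t in tasks:
        if t.has_speculative_copy:
            continue
        if t.node in failed_nodes:                      # failure-aware: react immediately
            candidates.append(t)
        elif t.progress < slow_factor * avg_progress:   # classic straggler heuristic
            candidates.append(t)
    return candidates

# Example usage
tasks = [Task("m_0", "node1", 0.8), Task("m_1", "node2", 0.1), Task("m_2", "node3", 0.7)]
avg = sum(t.progress for t in tasks) / len(tasks)
print([t.task_id for t in pick_speculation_candidates(tasks, {"node2"}, avg)])
```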
Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm
Hassan Eslami, Anthony Kougkas, Maria Kotsifakou, T. Kasampalis, Kun Feng, Yin Lu, W. Gropp, Xian-He Sun, Yong Chen, R. Thakur
{"title":"Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm","authors":"Hassan Eslami, Anthony Kougkas, Maria Kotsifakou, T. Kasampalis, Kun Feng, Yin Lu, W. Gropp, Xian-He Sun, Yong Chen, R. Thakur","doi":"10.1145/2831244.2831249","DOIUrl":"https://doi.org/10.1145/2831244.2831249","url":null,"abstract":"Many applications foreseen for exascale era should process huge amount of data. However, the IO infrastructure of current supercomputing architecture cannot be generalized to deal with this amount of data due to the need for excessive data movement from storage layers to compute nodes leading to limited scalability. There has been extensive studies addressing this challenge. Decoupled Execution Paradigm (DEP) is an attractive solution due to its unique features such as available fast storage devices close to computational units and available programmable units close to file system.\u0000 In this paper we study the effectiveness of DEP for a well-known data-intensive kernel, disk-to-disk (aka out-of-core) sorting. We propose an optimized algorithm that uses almost all features of DEP pushing the performance of sorting in HPC even further compared to other existing solutions. Advantages in our algorithm are gained by exploiting programming units close to parallel file system to achieve higher IO throughput, compressing data before sending it over network or to disk, storing intermediate results of computation close to compute nodes, and fully overlapping IO with computation. We also provide an analytical model for our proposed algorithm. Our algorithm achieves 30% better performance compared to the theoretically optimal sorting algorithm running on the same testbed but not designed to exploit the DEP architecture.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116244764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
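The DEP-specific sorting algorithm is not given in the abstract. As a rough, generic illustration of two of the listed ingredients (compressing intermediate data before it hits disk, and merging out-of-core runs), here is a plain external-sort sketch in Python; external_sort and _write_run are invented names and nothing here reflects the paper's design.

```python
# Generic disk-to-disk (external) sort: sort fixed-size chunks in memory,
# write them as compressed runs, then k-way merge the runs.

import gzip, heapq, os, tempfile

def _write_run(sorted_lines):
    """Write one sorted run to a compressed temporary file."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    with gzip.open(path, "wt") as f:        # compress before the data hits disk
        f.writelines(sorted_lines)
    return path

def external_sort(lines, chunk_size=100_000):
    """Sort an arbitrarily large iterable of text lines via sorted, compressed runs on disk."""
    runs, chunk = [], []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            runs.append(_write_run(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_write_run(sorted(chunk)))
    files = [gzip.open(p, "rt") for p in runs]
    try:
        yield from heapq.merge(*files)      # k-way merge of the sorted runs
    finally:
        for f, p in zip(files, runs):
            f.close()
            os.unlink(p)

# Example: sort 300,000 generated lines through compressed on-disk runs.
data = (f"{(i * 2654435761) % 1000003:07d}\n" for i in range(300_000))
result = list(external_sort(data))
print(result[0].strip(), result[-1].strip())
```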
Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in cloud environments
Francisco Rodrigo Duro, Francisco Javier García Blas, Florin Isaila, J. Carretero
{"title":"Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in cloud environments","authors":"Francisco Rodrigo Duro, Francisco Javier García Blas, Florin Isaila, J. Carretero","doi":"10.1145/2831244.2831248","DOIUrl":"https://doi.org/10.1145/2831244.2831248","url":null,"abstract":"In the current scientific computing scenario storage systems are one of the main bottlenecks in computing platforms. This issue affects both traditional high performance computing systems and modern systems based on cloud platforms. Accelerating the I/O subsystems can improve the overall performance of the applications. In this paper, we present Hercules as an I/O accelerator specially designed for improving I/O access in workflow engines deployed over cloud-based infraestructures. Hercules provides a dynamic and flexible in-memory storage platform based on NoSQL-based distributed memory systems. In addition, Hercules offers a user-level interface based on POSIX for facilitating its usage on existing solutions and legacy applications. We have evaluated the proposed solution in a public cloud environment, in this case Amazon EC2. The results show that Hercules provides a scalable I/O solution with remarkable performance, especially for write operations, compared with classic I/O approaches for high performance computing in cloud environments.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128093679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
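As a toy illustration of the general idea of a POSIX-like, file-oriented interface layered over an in-memory key-value store (not the Hercules API, whose details are not given in the abstract), the sketch below backs open/read/write with a plain Python dict; KVFileSystem and _WriteBuffer are hypothetical names.

```python
# Toy illustration (not the Hercules API): a POSIX-like open/read/write layer
# on top of a plain dict, standing in for a NoSQL in-memory back end.

import io

class _WriteBuffer(io.BytesIO):
    """Buffer that flushes its contents into the key-value store on close()."""
    def __init__(self, store, path):
        super().__init__()
        self._store, self._path = store, path

    def close(self):
        self._store[self._path] = self.getvalue()
        super().close()

class KVFileSystem:
    def __init__(self):
        self._store = {}                       # stand-in for the distributed store

    def open(self, path, mode="rb"):
        if "w" in mode:
            return _WriteBuffer(self._store, path)
        return io.BytesIO(self._store[path])   # read: serve bytes from the store

fs = KVFileSystem()
with fs.open("/workflow/stage1/out.dat", "wb") as f:
    f.write(b"intermediate results")
print(fs.open("/workflow/stage1/out.dat").read())
```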
Performance evaluation and tuning of BioPig for genomic analysis
Lizhen Shi, Zhong Wang, Weikuan Yu, Xiandong Meng
{"title":"Performance evaluation and tuning of BioPig for genomic analysis","authors":"Lizhen Shi, Zhong Wang, Weikuan Yu, Xiandong Meng","doi":"10.1145/2831244.2831252","DOIUrl":"https://doi.org/10.1145/2831244.2831252","url":null,"abstract":"In this study, we aim to optimize Hadoop parameters to improve the performance of BioPig on Amazon Web Service (AWS). BioPig is a toolkit for large-scale sequencing data analysis and is built on Hadoop and Pig that enables easy parallel programming and scaling to datasets of terabyte sizes. AWS is the most popular cloud-computing platform offered by Amazon. When running BioPig jobs on AWS, the default configuration parameters may lead to high computational costs. We select the k-mer counting as it is used in a large number of next generation sequence (NGS) data analysis tools. We tuned Hadoop parameters from five different perspectives based on a baseline configuration. We found tuning different Hadoop parameters led to various performance improvements. The overall job execution time of k-mer counting on BioPig was reduced by 50% using an optimized set of parameters. This paper documents our tuning experiments as a valuable reference for future Hadoop-based analytics applications on genomics datasets.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123992778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
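The tuned workload, k-mer counting, is simple to state. The snippet below is a minimal pure-Python version included only to make the workload concrete; BioPig itself expresses this as Pig/Hadoop jobs, and count_kmers is an invented name.

```python
# Minimal pure-Python k-mer counting, only to make the tuned workload concrete
# (BioPig expresses this as Pig/Hadoop jobs; this is not BioPig code).

from collections import Counter

def count_kmers(sequences, k=21):
    counts = Counter()
    for seq in sequences:
        seq = seq.strip().upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1       # slide a window of length k
    return counts

reads = ["ACGTACGTACGTACGTACGTACGT", "TTACGTACGTACGTACGTACGTAA"]
print(count_kmers(reads, k=5).most_common(3))
```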
SJM: an SCM-based journaling mechanism with write reduction for file systems
Lingfang Zeng, Binbing Hou, D. Feng, K. Kent
{"title":"SJM: an SCM-based journaling mechanism with write reduction for file systems","authors":"Lingfang Zeng, Binbing Hou, D. Feng, K. Kent","doi":"10.1145/2831244.2831246","DOIUrl":"https://doi.org/10.1145/2831244.2831246","url":null,"abstract":"Considering the unique characteristics of storage class memory (SCM), such as non-volatility, fast access speed, byte-addressability, low-energy consumption, and in-place modification support, we investigated the features of over-write and append-write and propose a safe and write-efficient SCM-based journaling mechanism for a file system called SJM. SJM integrates the ordered and journaling modes of the traditional journaling mechanisms by storing the metadata and over-write data in the SCM-based logging device as a write-ahead log and strictly controlling the data flow. SJM writes back the valid log blocks to the file system according to their access frequency and sequentiality and thus improves the write performance. We implemented SJM on Linux 3.12 with ext2, which has no journal mechanisms. Evaluation results show that ext2 with SJM outperforms ext3 with a ramdisk-based journaling device while keeping the version consistency, especially under workloads with large write requests.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134267467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
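As a hypothetical sketch of the general write-ahead-log idea behind SCM journaling (log metadata and over-write data first, write it back to the file system later), the Python toy below uses invented names (WriteAheadLog, checkpoint) and does not reflect SJM's kernel implementation or its frequency- and sequentiality-based write-back policy.

```python
# Hypothetical write-ahead-log toy (not SJM or kernel code): updates are logged
# first, then written back to their home locations at checkpoint time.

class WriteAheadLog:
    def __init__(self):
        self.log = []          # stands in for the SCM logging device
        self.fs = {}           # stands in for on-disk file-system blocks

    def write(self, block_no, data):
        # 1. append the update to the log before acknowledging the write
        self.log.append((block_no, data))

    def checkpoint(self):
        # 2. later, write valid log entries back to their home location;
        #    a real policy would order this by access frequency and sequentiality
        for block_no, data in self.log:
            self.fs[block_no] = data
        self.log.clear()

wal = WriteAheadLog()
wal.write(7, b"metadata update")
wal.write(7, b"over-write of the same block")   # absorbed before reaching "disk"
wal.checkpoint()
print(wal.fs[7])
```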
A low-cost adaptive data separation method for the flash translation layer of solid state drives
Wei Xie, Yong Chen, P. Roth
{"title":"A low-cost adaptive data separation method for the flash translation layer of solid state drives","authors":"Wei Xie, Yong Chen, P. Roth","doi":"10.1145/2831244.2831250","DOIUrl":"https://doi.org/10.1145/2831244.2831250","url":null,"abstract":"Solid state drives (SSDs) have shown great potential for data-intensive computing due to their much higher throughput and lower energy consumption compared to traditional hard disk drives. Within an SSD, its Flash Translation Layer (FTL) is responsible for exposing the SSD's flash memory storage to the computer system as a simple block device. The FTL design is one of the dominant factors determining an SSD's lifespan and the amount of performance degradation. To deliver better performance, we propose a new, low-cost, adaptive separation-aware flash translation layer (ASA-FTL) that combines data clustering and selective caching of recency information to accurately identify and separate hot/cold data while incurring minimal overhead. Using simulations of ASA-FTL with real-world workloads, we have shown that our proposed approach reduces the garbage collection overhead by up to 28% and the overall response time by 15% compared to one of the most advanced existing FTLs.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131838192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
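The abstract names clustering as the basis for hot/cold separation. The sketch below shows one generic way to do that, a one-dimensional two-means clustering of per-page update counts; split_hot_cold is an invented name and this is not the ASA-FTL algorithm.

```python
# Generic hot/cold separation via a one-dimensional two-means clustering of
# per-page update counts (illustrative only; not the ASA-FTL algorithm).

def split_hot_cold(update_counts, iterations=10):
    """Return (hot_pages, cold_pages) given {page: update_count}."""
    values = list(update_counts.values())
    lo, hi = min(values), max(values)          # initial centroids
    for _ in range(iterations):
        cold = [v for v in values if abs(v - lo) <= abs(v - hi)]
        hot = [v for v in values if abs(v - lo) > abs(v - hi)]
        if cold:
            lo = sum(cold) / len(cold)
        if hot:
            hi = sum(hot) / len(hot)
    threshold = (lo + hi) / 2
    hot_pages = {p for p, c in update_counts.items() if c > threshold}
    return hot_pages, set(update_counts) - hot_pages

counts = {"p0": 1, "p1": 2, "p2": 40, "p3": 55, "p4": 3}
print(split_hot_cold(counts))
```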
Route-aware independent MPI I/O on the Blue Gene/Q
Preeti Malakar, V. Vishwanath
{"title":"Route-aware independent MPI I/O on the blue gene/Q","authors":"Preeti Malakar, V. Vishwanath","doi":"10.1145/2831244.2831251","DOIUrl":"https://doi.org/10.1145/2831244.2831251","url":null,"abstract":"Scalable high-performance I/O is crucial for application performance on large-scale systems. With the growing complexity of the system interconnects, it has become important to consider the impact of network contention on I/O performance because the I/O messages traverse several hops in the interconnect before reaching the I/O nodes or the file system. In this work, we present a route-aware and load-aware algorithm to modify existing bridge node assignment in the Blue Gene/Q (BG/Q) supercomputer. We reduce the network contention and reduce the write time by an average of 60% over the default independent I/O and by 20% over collective I/O on up to 8192 nodes on the Mira BG/Q system. Our algorithm routes 1.4x fewer messages through the bridge nodes which connect to the I/O nodes on the BG/Q.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"141 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129308777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
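As an illustration of what a route- and load-aware assignment could look like (not the paper's algorithm), the sketch below greedily assigns each compute node to the reachable bridge node with the fewest hops among those still under a load cap; assign_bridges, hops, and load_cap are invented names.

```python
# Route- and load-aware bridge assignment sketch (illustrative only; not the
# paper's algorithm). Assumes load_cap * number_of_bridges >= number_of_nodes.

def assign_bridges(hops, load_cap):
    """hops[node][bridge] = hop count; returns {node: bridge}."""
    load = {b: 0 for b in next(iter(hops.values()))}
    assignment = {}
    # handle the most constrained nodes first (largest spread in hop counts)
    order = sorted(hops, key=lambda n: max(hops[n].values()) - min(hops[n].values()),
                   reverse=True)
    for node in order:
        candidates = [b for b in hops[node] if load[b] < load_cap]
        best = min(candidates, key=lambda b: (hops[node][b], load[b]))
        assignment[node] = best
        load[best] += 1
    return assignment

hops = {
    "c0": {"b0": 1, "b1": 4},
    "c1": {"b0": 1, "b1": 5},
    "c2": {"b0": 2, "b1": 2},
}
print(assign_bridges(hops, load_cap=2))
```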
Supporting online analytics with user-defined estimation and early termination in a MapReduce-like framework
Yi Wang, Linchuan Chen, G. Agrawal
{"title":"Supporting online analytics with user-defined estimation and early termination in a MapReduce-like framework","authors":"Yi Wang, Linchuan Chen, G. Agrawal","doi":"10.1145/2831244.2831247","DOIUrl":"https://doi.org/10.1145/2831244.2831247","url":null,"abstract":"Online analytics based on runtime approximation has been widely adopted for meeting time and/or resource constraints. Though MapReduce has been gaining its popularity in both scientific and commercial sectors, there are several obstacles in implementing online analytics in a MapReduce implementation.\u0000 In this paper, we present a MapReduce-like framework for online analytics. Our system can process the input incrementally, provide fast estimates, and terminate the execution as soon as a user-defined termination state is reached. We have extended the MapReduce API by allowing the user to customize both the estimation method and termination condition. We also have shown both the functionality and efficiency of our system through three approximate applications. A comparison with a batch processing implementation shows a speedup of at least an order of magnitude.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127870826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
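As a sketch of the programming-model idea described above, processing the input in increments and letting the user supply both an estimator and a termination test, the snippet below uses invented names (online_mapreduce, estimate_fn, should_stop) and is not the paper's extended MapReduce API.

```python
# Sketch of a MapReduce-like loop with user-defined estimation and early
# termination (illustrative only; not the paper's API or system).

def online_mapreduce(chunks, map_fn, reduce_fn, estimate_fn, should_stop):
    partial, seen, estimate = None, 0, None
    for chunk in chunks:
        mapped = [map_fn(x) for x in chunk]          # incremental map
        partial = reduce_fn(partial, mapped)         # fold into running result
        seen += len(chunk)
        estimate = estimate_fn(partial, seen)        # user-defined estimator
        if should_stop(estimate, seen):              # user-defined termination
            break
    return estimate

# Example: estimate the mean of a stream, stopping after 5,000 records.
chunks = ([i % 100 for i in range(1000)] for _ in range(50))
result = online_mapreduce(
    chunks,
    map_fn=lambda x: x,
    reduce_fn=lambda acc, vals: (acc or 0) + sum(vals),
    estimate_fn=lambda total, n: total / n,
    should_stop=lambda est, n: n >= 5000,
)
print(result)   # 49.5
```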
Big data analytics on traditional HPC infrastructure using two-level storage
Pengfei Xuan, Jeffrey Denton, Feng Luo, P. Srimani
{"title":"Big data analytics on traditional HPC infrastructure using two-level storage","authors":"Pengfei Xuan, Jeffrey Denton, Feng Luo, P. Srimani","doi":"10.1145/2831244.2831253","DOIUrl":"https://doi.org/10.1145/2831244.2831253","url":null,"abstract":"Data-intensive computing has become one of the major workloads on traditional high-performance computing (HPC) clusters. Currently, deploying data-intensive computing software framework on HPC clusters still faces performance and scalability issues. In this paper, we develop a new two-level storage system by integrating Tachyon, an in-memory file system with OrangeFS, a parallel file system. We model the I/O throughputs of four storage structures: HDFS, OrangeFS, Tachyon and two-level storage. We conduct computational experiments to characterize I/O throughput behavior of two-level storage and compare its performance to that of HDFS and OrangeFS, using TeraSort benchmark. Theoretical models and experimental tests both show that the two-level storage system can increase the aggregate I/O throughputs. This work lays a solid foundation for future work in designing and building HPC systems that can provide a better support on I/O intensive workloads with preserving existing computing resources.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129516949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
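The paper's throughput models are not reproduced in the abstract. The snippet below is a deliberately simple illustrative stand-in, assuming a fraction of the data is served from the in-memory tier and the rest from the parallel file system, so that aggregate throughput follows from the sum of the per-tier times; it is not the authors' model, and all numbers are made up.

```python
# Toy two-level throughput model (an illustrative assumption, not the paper's
# model): a fraction f of the data is read from the in-memory tier and the rest
# from the parallel file system, so time per byte is the weighted sum of the
# two inverse bandwidths.

def two_level_throughput(fraction_in_memory, mem_bw_gbps, pfs_bw_gbps):
    f = fraction_in_memory
    time_per_byte = f / mem_bw_gbps + (1 - f) / pfs_bw_gbps
    return 1.0 / time_per_byte        # aggregate GB/s

# Example: 70% of reads hit the in-memory tier (20 GB/s), 30% go to the PFS (4 GB/s).
print(f"{two_level_throughput(0.7, 20.0, 4.0):.2f} GB/s aggregate")   # ~9.09
```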
Enhancing both fairness and performance using rate-aware dynamic storage cache partitioning
Yong Li, D. Feng, Zhan Shi
{"title":"Enhancing both fairness and performance using rate-aware dynamic storage cache partitioning","authors":"Yong Li, D. Feng, Zhan Shi","doi":"10.1145/2534645.2534650","DOIUrl":"https://doi.org/10.1145/2534645.2534650","url":null,"abstract":"In this paper, we investigate the problem of fair storage cache allocation among multiply competing applications with diversified access rates. Commonly used cache replacement policies like LRU and most LRU variants are inherently unfair in cache allocation for heterogenous applications. They implicitly give more cache to the applications that has high access rate and less cache to the applications of slow access rate. However, applications of fast access rate do not always gain higher performance from the additional cache blocks. In contrast, the slow application suffer poor performance with a reduced cache size. It is beneficial in terms of both performance and fairness to allocate cache blocks by their utility.\u0000 In this paper, we propose a partition-based cache management algorithm for a shared cache. The goal of our algorithm is to find an allocation such that all heterogenous applications can achieve a specified fairness degree, while maximizing the overall performance. To achieve this goal, we present an adaptive partition framework, which partitions the shared cache among competing applications and dynamic adjusts the partition size based on predicted utility on both fairness and performance. We implemented our algorithm in a storage simulator and evaluated the fairness and performance with various workloads. Experimental results show that, compared with LRU, our algorithm achieves large improvement in fairness and slightly in performance.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128827695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
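As a generic sketch of utility-based cache partitioning with a fairness floor (not the paper's algorithm), the snippet below gives every application a guaranteed minimum share and then hands out the remaining blocks one at a time to whichever application gains the most from one more block; partition_cache, utility, and min_share are invented names and the utility curves are made up.

```python
# Generic utility-based cache partitioning with a fairness floor (illustrative
# only; not the paper's algorithm).

import math

def partition_cache(total_blocks, utility, min_share):
    """utility[app](size) -> expected benefit at that cache size; returns {app: blocks}."""
    alloc = {app: min_share for app in utility}           # fairness floor
    remaining = total_blocks - min_share * len(utility)
    for _ in range(remaining):
        # marginal gain of one extra block for each application
        gains = {app: utility[app](alloc[app] + 1) - utility[app](alloc[app])
                 for app in utility}
        winner = max(gains, key=gains.get)
        alloc[winner] += 1
    return alloc

# Example with made-up concave utility curves (diminishing returns).
utility = {
    "fast_app": lambda s: 100 * math.log1p(s),
    "slow_app": lambda s: 400 * math.log1p(0.1 * s),
}
print(partition_cache(total_blocks=64, utility=utility, min_share=8))
```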