{"title":"Multicast Scheduling with Markov Chains in Fat-Tree Data Center Networks","authors":"Guozhi Li, Songtao Guo, Guiyan Liu, Yuanyuan Yang","doi":"10.1109/NAS.2017.8026867","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026867","url":null,"abstract":"Multicast can improve network performance by eliminating unnecessary duplicated flows in data center networks (DCNs), which significantly saves network bandwidth and improves the network Quality of Service (QoS). However, multicast blocking causes the retransmission of a large number of data packets and seriously degrades the traffic efficiency of data center networks, especially for multicast traffic in fat-tree DCNs, owing to their multi-rooted tree structure. In this paper, we propose a novel multicast scheduling strategy to reduce multicast blocking. To decrease the operation time of the proposed algorithm, the remaining bandwidth of the selected uplink connecting to an available core switch should be close to, but greater than, three times the bandwidth of the multicast requests. Then the blocking probability of each downlink at the next time slot is calculated using Markov chain theory. Furthermore, we select the downlink with the minimum blocking probability as the optimal path at the next time slot. In addition, theoretical analysis shows that the blocking probability of the scheduling algorithm is close to zero and that the algorithm has lower time complexity. 
Simulation results verify the effectiveness of our proposed multicast scheduling algorithm.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123898446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Virtual Metadata Servers to Provide Multi-Level Consistency for Key-Value Object-Based Data Store","authors":"Xiaozhao Zhuang, Xiaoyang Qu, Z. Lu, Ji-guang Wan, C. Xie","doi":"10.1109/NAS.2017.8026857","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026857","url":null,"abstract":"Distributed data stores are a fundamental building block for various Internet services. For large-scale distributed data stores, the scalability and consistency of metadata services are prone to become the bottleneck. Various schemes have been proposed to tackle the challenge of scalability and consistency within metadata services. While centralized single-node metadata services with low scalability provide low-overhead consistency maintenance, distributed metadata servers with high scalability often suffer from complicated management and high-overhead consistency maintenance. As some key-value object-based storage systems locate and access an object by a hashing function (e.g., a consistent hashing table), there are no dedicated physical servers for metadata services. For key-value stores without dedicated metadata servers, we exploit a scheme called virtual metadata servers (virtual MDS), which creates an opportunity to provide high performance and multi-level consistency. While conventional key-value data stores distribute metadata across data nodes, our scheme uses proxy nodes, where virtual disks are created, as virtual MDS to hold the metadata of virtual disks. Meanwhile, we also combine the characteristics of virtual disks and metadata services to implement a multi-level consistency strategy for key-value object-based stores without dedicated physical metadata servers. With virtual MDS, we use version information to update data asynchronously, check version consistency periodically, and then correct stale entries. In this way, our virtual MDS can provide multiple levels of consistency to cope with different read performance demands from users. 
The experiment results demonstrate that our scheme with relaxed consistency can enhance random write performance by 50% and improve random read performance by 16% compared with the standard storage system with strict consistency.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116836556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A-MapCG: An Adaptive MapReduce Framework for GPUs","authors":"Lifeng Liu, Yue Zhang, Meilin Liu, Chong-Jun Wang, Jun Wang","doi":"10.1109/NAS.2017.8026842","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026842","url":null,"abstract":"The MapReduce framework proposed by Google to process large data sets is an efficient framework used in many areas, such as social networks, scientific research, and electronic business. Hence, many MapReduce frameworks have been proposed and implemented on different platforms. However, these MapReduce frameworks have limitations: they cannot handle the collision problem in the map phase or the unbalanced workload problem in the reduce phase. In this paper, an Adaptive MapReduce Framework (A-MapCG) is proposed, based on the MapCG framework, to further improve MapReduce performance on GPU platforms. Based on experiments, we observed that for certain MapReduce applications emitting multiple key/value (K/V) pairs for the same key, the atomic collision problem substantially degrades the map phase performance of the MapReduce framework. In addition, the workload imbalance problem wastes parallel computing resources and limits the overall reduce phase performance of the MapReduce framework on GPU platforms. A-MapCG uses a segmentation table and intra-warp combination to reduce the number of collisions during the map phase. A-MapCG also adopts balanced workload assignment to improve the reduce phase performance. The proposed A-MapCG framework is evaluated on a Tesla K40 GPU hosted by an Intel Core i7-4790. The case study shows that the map phase of A-MapCG achieves a speedup of 4.63 over MapCG for the Word Count test case with a 64MB workload. The average reduce phase speedup of A-MapCG over MapCG with parallel reductions of Word Count is 6.92. 
The average reduce phase speedup of A-MapCG over MapCG with serial reductions of Word Count is 4.11.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127081279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High-Performance Persistent Identifier Management Protocol","authors":"Fatih Berber, R. Yahyapour","doi":"10.1109/NAS.2017.8026839","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026839","url":null,"abstract":"Persistent identifiers are well acknowledged for providing an abstraction for addresses of research datasets. However, due to the explosive growth of research datasets, persistent identification is increasingly viewed as a much more fundamental component of research data management. The ability to attach semantic information to persistent identifier records in principle enables the realization of a virtual global research data network by means of persistent identifiers. However, the increased importance of persistent identifiers has at the same time led to a steadily increasing load on persistent identifier systems. Therefore, the focus of this paper is to propose a high-performance persistent identifier management protocol. In contrast to the DNS, persistent identifier systems are usually subjected to bulky registration requests originating from individual research data repositories. Thus, the fundamental approach in this work is to implement a bulk registration operation in persistent identifier systems. To this end, we provide an extended version of the established Handle protocol equipped with a bulk registration operation. Moreover, we provide a specification and an efficient data model for such a bulk registration operation. Finally, by means of a comprehensive evaluation, we show the profound speedup achieved by our extended version of the Handle System. 
This is also highly relevant for various other persistent identifier systems, which are based on the Handle System.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126878709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Megalloc: Fast Distributed Memory Allocator for NVM-Based Cluster","authors":"Songping Yu, Nong Xiao, Mingzhu Deng, Yuxuan Xing, Fang Liu, Wei Chen","doi":"10.1109/NAS.2017.8026865","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026865","url":null,"abstract":"As emerging Non-Volatile Memory (NVM) technologies, such as 3D XPoint, enter production, there has been a recent push in the big data processing community from storage-centric towards memory-centric processing. Generally, in large-scale systems, distributed memory management over traditional networks using the TCP/IP protocol exposes a performance bottleneck. In brief, CPU-centric networking involves context switching, memory copies, and so on. Remote Direct Memory Access (RDMA) technology offers a tremendous performance advantage over TCP/IP by allowing direct access to remote memory, bypassing the OS kernel. In this paper, we propose Megalloc, a distributed NVM allocator that exposes NVMs as a shared address space across a cluster of machines based on RDMA. First, it makes memory allocation metadata directly accessible to each machine, allocating NVM in a coarse-grained way; second, it adopts fine-grained memory chunks for applications to read or store data; finally, it guarantees high distributed memory allocation performance.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129875970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rack Level Scheduling for Containerized Workloads","authors":"Qiumin Xu, Krishna T. Malladi, M. Awasthi","doi":"10.1109/NAS.2017.8026873","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026873","url":null,"abstract":"High performance SSDs have become ubiquitous in warehouse scale computing. Increased adoption can be attributed to their high bandwidth, low latency and excellent random I/O performance. Owing to this high performance, multiple I/O intensive services can now be co-located on the same server. SSDs also introduce periodic latency spikes due to garbage collection. This, combined with multi-tenancy, increases latency unpredictability since co-located applications now compete for CPU, memory, and disk bandwidth. The combination of these latency spikes and unpredictability leads to long tail latencies that can significantly decrease system performance at scale. In this paper, we present a rack-level scheduling algorithm, which dynamically detects and shifts workloads with long tail latencies within servers in the same rack. 
Different from the global resource management methods, rack-level scheduling utilizes lightweight containers to minimize data movement and message passing overheads, leading to a much more efficient solution to reduce tail latency. With the algorithms implemented in the storage driver of the containerization infrastructure, it becomes viable to deploy and migrate applications in existing server racks without extensive modifications to storage, OS and other subsystems.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"216 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134091081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of Universal Broker Architecture for Edge Networking","authors":"Joo-Hyun Kim, Seo-Hee Hong, So-Hyun Yang, Jae-Hoon Kim","doi":"10.1109/NAS.2017.8026852","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026852","url":null,"abstract":"The number of available network interfaces is soaring, and Internet of Things (IoT) devices now face more complex networking operations. Extensive efforts are needed to combine these various network interfaces in the expanding IoT environment. We suggest a design of a Universal Broker architecture to integrate the diversified network interfaces. Any IoT application can use the Universal Broker for transparent communication between different network technologies. We adopt MQTT for overall network message exchanges. We implement a testbed for three candidate network interfaces, Wi-Fi, LTE-M, and LoRa, working on a single broker. By demonstrating the proper operation of a single broker under network heterogeneity, we expect an effective expansion of IoT applications in future edge networking environments.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122542272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Healthedge: Task Scheduling for Edge Computing with Health Emergency and Human Behavior Consideration in Smart Homes","authors":"Haoyu Wang, Jiaqi Gong, Yan Zhuang, Haiying Shen, J. Lach","doi":"10.1109/NAS.2017.8026861","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026861","url":null,"abstract":"Nowadays, a large number of services are deployed on the edge of the network rather than in the cloud, since processing data at the edge can reduce response time and lower bandwidth cost for applications such as healthcare in smart homes. Resource management is very important in edge computing since it can increase system efficiency and improve the quality of service. A common approach to resource management in edge computing is to assign tasks to the remote cloud or edge devices solely according to factors such as energy, bandwidth consumption, and latency. However, this approach is insufficiently efficient and falls short of meeting the requirements of handling health emergencies when applied in smart homes for healthcare. A possible health emergency needs immediate attention, and different health tasks have different processing priorities. In this paper, we propose a task scheduling approach called HealthEdge that sets different processing priorities for different tasks based on the collected data on human health status and determines whether a task should run on a local device or in a remote cloud in order to reduce its total processing time as much as possible. Based on a real trace from five patients, we conduct a trace-driven experiment to evaluate the performance of HealthEdge in comparison with other methods. 
The results show that HealthEdge can optimally assign tasks between the network edge and cloud, which can reduce the task processing time, reduce bandwidth consumption and increase local edge workstation utilization.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121847612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ISM- An Intra-Stripe Data Migration Approach for RAID-5 Scaling","authors":"Jie Liang, Yinlong Xu, Yongkun Li, Yubiao Pan","doi":"10.1109/NAS.2017.8026863","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026863","url":null,"abstract":"Scaling is often carried out in modern RAID systems to meet the ever increasing demand for storage capacity and I/O performance. However, the scaling process of a RAID-5 system is not trivial, due to its specific data/parity layout. Previous approaches to RAID-5 scaling require either migrating almost all data blocks in the system, or recalculating all or some of the parity blocks during scaling. This paper proposes a new RAID-5 scaling approach called ISM (Intra-Stripe Migration). With ISM, data migrations only happen within stripes, which means that the coding relationship among blocks remains the same. Therefore, the parity blocks do not need to be recalculated after data migration, which greatly reduces the I/O and computational costs. The properties of the ISM approach can be summarized as follows: (1) it requires the minimum number of data blocks to be migrated, (2) it supports data migration without recalculating parity blocks, and (3) it supports multiple successive scaling operations while keeping the above properties. 
The simulation results on DiskSim show that: (1) ISM reduces the scaling time by 47.91% to 87.01% and by 88.94% to 96.58% compared with GSR and ALV, respectively, in offline scaling, (2) under two real-world I/O traces, ISM also outperforms GSR by 64.15% to 87.30%, and ALV by 92.38% to 95.98% in scaling time, and (3) ISM maintains almost the same data access performance as ALV and GSR after scaling.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123216718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and Correlation of Application I/O Performance and System-Wide I/O Activity","authors":"Sandeep Madireddy, Prasanna Balaprakash, P. Carns, R. Latham, R. Ross, S. Snyder, Stefan M. Wild","doi":"10.1109/NAS.2017.8026844","DOIUrl":"https://doi.org/10.1109/NAS.2017.8026844","url":null,"abstract":"Storage resources in high-performance computing are shared across all user applications. Consequently, storage performance can vary markedly, depending not only on an application's workload but also on what other activity is concurrently running across the system. This variability in storage performance is directly reflected in overall execution time variability, thus confounding efforts to predict job performance for scheduling or capacity planning. I/O variability also complicates the seemingly straightforward process of performance measurement when evaluating application optimizations. In this work we present a methodology to measure I/O contention with more rigor than in prior work. We apply statistical techniques to gain insight from application-level statistics and storage-side logging. We examine different correlation metrics for relating system workload to job I/O performance and identify an effective and generally applicable metric for measuring job I/O performance. We further demonstrate that the system-wide monitoring granularity can directly affect the strength of correlation observed. 
Insufficient granularity and measurements can hide the correlations between application I/O performance and system-wide I/O activity.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128737930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}