2021 IEEE International Conference on Networking, Architecture and Storage (NAS): Latest Publications

NVSwap: Latency-Aware Paging using Non-Volatile Main Memory
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605418
Yekang Wu, Xuechen Zhang
Abstract: Page relocation (paging) from DRAM to swap devices is an important task of a virtual memory system in operating systems. Existing Linux paging mechanisms have two main deficiencies: (1) they may incur high I/O latency due to write interference on solid-state disks and an aggressive memory page reclaiming rate under high memory pressure, and (2) they do not provide predictable latency bounds for latency-sensitive applications because they cannot control the allocation of system resources among concurrent processes sharing swap devices. In this paper, we present the design and implementation of a latency-aware paging mechanism called NVSwap. It supports a hybrid swap space using both regular secondary storage devices (e.g., solid-state disks) and non-volatile main memory (NVMM). This design is more cost-effective than using only NVMM as swap space. Furthermore, NVSwap uses NVMM as a persistent paging buffer to serve page-out requests and hide the latency of paging between the regular swap device and DRAM. It supports in-situ paging for pages in the persistent paging buffer, avoiding the slow I/O path. Finally, NVSwap allows users to specify latency bounds for individual processes or groups of related processes and enforces the bounds by dynamically controlling the allocation of NVMM and the page reclaiming rate in memory among scheduling units. We have implemented a prototype of NVSwap in Linux kernel 4.4.241 using Intel Optane DIMMs. Our results demonstrate that NVSwap reduces paging latency by up to 99% and provides performance guarantees and isolation among concurrent applications sharing swap devices.
Citations: 0
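
The entry above describes NVSwap enforcing per-process latency bounds by reallocating NVMM and throttling reclaim. The paper publishes no code here; the following is a minimal Python sketch of one plausible feedback loop, in which a process that misses its bound takes NVMM paging-buffer pages from the process with the most slack. The class name, the fixed transfer step, and the slack heuristic are all hypothetical, not taken from the NVSwap prototype.

```python
# Hypothetical sketch, not NVSwap's code: a feedback controller that moves
# NVMM paging-buffer pages toward processes missing their latency bounds.

class NVMMShareController:
    def __init__(self, bounds_us, total_pages, step=64):
        self.bounds_us = dict(bounds_us)          # pid -> latency bound (microseconds)
        self.step = step                          # pages transferred per adjustment
        n = len(self.bounds_us)
        self.share = {pid: total_pages // n for pid in self.bounds_us}
        self.slack = {pid: 0.0 for pid in self.bounds_us}

    def report(self, pid, measured_us):
        """Called with each observed page-out latency; reallocates on a miss."""
        self.slack[pid] = self.bounds_us[pid] - measured_us
        if self.slack[pid] >= 0:
            return                                # bound met, nothing to do
        donor = max(self.slack, key=self.slack.get)   # process with most headroom
        if donor != pid and self.share[donor] >= self.step:
            self.share[donor] -= self.step        # shrink the donor's NVMM share
            self.share[pid] += self.step          # grow the violator's NVMM share

ctl = NVMMShareController({1: 50, 2: 200}, total_pages=4096)
ctl.report(1, measured_us=120)    # pid 1 misses its 50 us bound
ctl.report(2, measured_us=80)     # pid 2 is within its 200 us bound
print(ctl.share)                  # pid 1's share grew at pid 2's expense
```
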
[Copyright notice]
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605439
Citations: 0

Characterizing AI Model Inference Applications Running in the SGX Environment
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605445
Shixiong Jing, Qinkun Bao, Pei Wang, Xulong Tang, Dinghao Wu
Abstract: Intel Software Guard Extensions (SGX) is a set of extensions built into Intel CPUs for trusted computation. It creates a hardware-assisted secure container, within which programs are protected from data leakage and data manipulation by privileged software and hypervisors. As more and more machine learning based programs move to cloud computing, SGX can be used in cloud-based machine learning applications to protect user data from malicious privileged programs. However, applications running in SGX suffer from several overheads, including frequent context switching, memory page encryption/decryption, and memory page swapping, which significantly degrade execution efficiency. In this paper, we aim to (i) comprehensively explore the execution of general AI applications running on SGX, (ii) systematically characterize data reuse at both page granularity and cacheline granularity, and (iii) provide optimization insights for efficient deployment of machine learning based applications on SGX. To the best of our knowledge, our work is the first to study machine learning applications on SGX and explore the potential of data reuse to reduce runtime overheads in SGX.
Citations: 0
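
The characterization above works at page and cacheline granularity. As an illustration of the kind of metric such a study computes, here is a small, self-contained sketch of LRU stack (reuse) distances over an address trace at both granularities; the trace and helper are invented for the example, not drawn from the paper's toolchain.

```python
# Hypothetical sketch (not from the paper): reuse distances of an address
# trace at 4 KiB page vs 64 B cacheline granularity.

def reuse_distances(addrs, block_bits):
    """LRU stack distance per access, with blocks of size 2**block_bits."""
    stack, dists = [], []
    for a in addrs:
        b = a >> block_bits
        if b in stack:
            d = len(stack) - 1 - stack.index(b)   # distinct blocks since last use
            stack.remove(b)
        else:
            d = None                              # cold miss
        stack.append(b)
        dists.append(d)
    return dists

trace = [0x1000, 0x1040, 0x2000, 0x1000, 0x9000, 0x1040]
print(reuse_distances(trace, 12))  # page (4 KiB) granularity
print(reuse_distances(trace, 6))   # cacheline (64 B) granularity
```

The same trace shows shorter reuse distances at page granularity than at cacheline granularity, which is exactly the gap that makes the two-level characterization informative.
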
Decoupling Control and Data Transmission in RDMA Enabled Cloud Data Centers
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605415
Qingyue Liu, P. Varman
Abstract: Advances in storage, processing, and networking hardware are changing the structure of distributed applications. RDMA networks provide multiple communication mechanisms that enable novel hybrid protocols specialized to different data transfer requirements. In this paper, we present a distributed communication scheme that separates the control and data communication channels directly at the RNIC rather than at the application level. We develop a new communication artifact, a remote random access buffer, to efficiently implement this separation. Data messages are sent silently to the receiver, which is informed of the location of the data by a subsequent control message. Experiments on an RDMA-enabled cluster with microbenchmarks and two distributed applications validate the performance benefits of our approach.
Citations: 1
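
The remote random access buffer is implemented at the RNIC; the toy sketch below only mimics the protocol shape in Python, with a shared bytearray standing in for a remotely writable registered buffer and a queue standing in for the control channel. It shows the ordering that matters: data lands silently first, then a small control message tells the receiver where to look.

```python
# Hypothetical sketch of the decoupling idea, not the paper's RNIC-level
# implementation: "silent" data placement plus a separate control message.

import threading
import queue

buf = bytearray(4096)       # stands in for a registered, remotely writable region
ctrl = queue.Queue()        # stands in for the control channel (e.g. SEND/RECV)

def sender(payload, offset):
    buf[offset:offset + len(payload)] = payload   # one-sided data transfer
    ctrl.put((offset, len(payload)))              # control message: location + size

def receiver():
    offset, length = ctrl.get()                   # receiver polls only the control path
    print(bytes(buf[offset:offset + length]))

t = threading.Thread(target=receiver)
t.start()
sender(b"decoupled payload", offset=128)
t.join()
```
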
E2E Visual Analytics: Achieving >10X Edge/Cloud Optimizations
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605404
Chaunté W. Lacewell, Nilesh A. Ahuja, Pablo Muñoz, Parual Datta, Ragaad Altarawneh, Vui Seng Chua, Nilesh Jain, Omesh Tickoo, R. Iyer
Abstract: As visual analytics continues to grow rapidly, there is a critical need to improve the end-to-end efficiency of visual processing in edge/cloud systems. In this paper, we cover algorithms, systems, and optimizations in three major areas of edge/cloud visual processing: (1) improving the storage and retrieval efficiency of visual data and metadata by employing and optimizing visual data management systems, (2) improving the compute efficiency of visual analytics by exploiting co-optimization between the compression and analytics domains, and (3) improving the networking (bandwidth) efficiency of visual data compression by tailoring it to the analytics tasks. We describe techniques in each of these areas and measure their efficacy on state-of-the-art platforms (Intel Xeon), workloads, and datasets. Our results show that we can achieve >10X improvements in each area based on novel algorithms, systems, and co-design optimizations. We also outline future research directions based on our findings, identifying opportunities for further performance and efficiency gains in end-to-end visual analytics.
Citations: 0
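
The second and third levers above co-optimize compression against the analytics task. A minimal sketch of that idea follows: pick the most aggressive compression level whose downstream accuracy stays within a tolerance of the baseline. The accuracy-versus-quality numbers are made up for illustration; the paper reports no such table here.

```python
# Hypothetical sketch of task-aware compression tuning. The accuracy values
# per JPEG quality level are invented, standing in for measured ones.

accuracy_at_quality = {95: 0.760, 80: 0.758, 60: 0.751, 40: 0.731, 20: 0.655}

def pick_quality(baseline=0.760, max_drop=0.01):
    ok = [q for q, acc in accuracy_at_quality.items() if baseline - acc <= max_drop]
    return min(ok)   # lowest acceptable quality = fewest bytes on the network

print(pick_quality())  # -> 60: a large bandwidth win at negligible accuracy loss
```
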
On Adapting the Cache Block Size in SSD Caches
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605462
Nikolaus Jeremic, Helge Parzyjegla, Gero Mühl
Abstract: SSD-based block-level caches can notably increase the performance of HDD-based storage systems. However, this demands a sensible choice of the cache block size, which depends strongly on the workload characteristics. Many workloads will most likely favor either small or large cache blocks. Unfortunately, choosing the appropriate cache block size is difficult due to the diversity and dynamics of storage workloads. Thus, adapting the cache block size to the workload characteristics at run time has the potential to substantially improve cache performance compared to using a fixed cache block size. However, changing the cache block size for all cached data is very costly and neglects that distinct parts of the data may exhibit different access patterns, which favor distinct cache block sizes. In this paper, we experimentally study the performance impact of the cache block size and of fine-grained adaptation, i.e., for individual parts of the data, between small and large cache blocks in write-back SSD caches. Based on our results, we make two major observations. First, using an inappropriate cache block size can reduce the overall throughput by up to 84% compared to using the most suitable cache block size. Second, fine-grained adaptation between small and large cache blocks is highly beneficial, as it avoids such performance deterioration and can increase the overall throughput by up to 126% in comparison to using the more suitable fixed cache block size.
Citations: 1
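
Fine-grained adaptation decides, per part of the data, whether small or large cache blocks fit better. The sketch below illustrates one plausible policy, classifying 1 MiB regions by how sequential their accesses are; the region size, threshold, and block sizes are assumptions for illustration, not the paper's parameters.

```python
# Hypothetical sketch, not the paper's cache: per-region choice between
# small and large cache blocks, driven by access sequentiality.

from collections import defaultdict

SMALL, LARGE = 4096, 65536          # candidate cache block sizes (bytes)
REGION = 1 << 20                    # adapt per 1 MiB region of the backing HDD

last_end = defaultdict(lambda: None)
seq_hits = defaultdict(int)
total = defaultdict(int)

def record_access(offset, size):
    r = offset // REGION
    total[r] += 1
    if last_end[r] == offset:       # continues the previous access: sequential
        seq_hits[r] += 1
    last_end[r] = offset + size

def block_size(region):
    """Mostly-sequential regions favor large blocks; random ones, small."""
    if total[region] == 0:
        return SMALL
    return LARGE if seq_hits[region] / total[region] > 0.5 else SMALL

for off in range(0, 512 * 1024, 4096):   # sequential scan within region 0
    record_access(off, 4096)
print(block_size(0))                     # -> 65536
```
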
Exploring Storage Device Characteristics of A RISC-V Little-core SoC
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605430
Tao Lu
Abstract: Low-power system-on-chips (SoCs) dominate the Internet of Things (IoT) ecosystem, which consists of billions of devices that can generate zettabytes of data. SoCs interact directly with this big data, yet there is little research on their storage performance and power consumption characteristics, and quantitative evaluations in particular are lacking. In this paper, we study the storage characteristics of a low-power RISC-V SoC FPGA. Specifically, we deploy a PCIe SSD to study the performance of storage devices driven by little cores. We quantitatively evaluate device bandwidth, IOPS throughput, and power consumption. In addition, we compare the same device on the low-power RISC-V SoC and on a high-performance x86 server to observe the similarities and differences in storage device behavior across computing platforms.
Citations: 0
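
Bandwidth and IOPS numbers of this kind are typically collected with a benchmark such as fio. The authors do not publish their harness; the sketch below shows one way such a measurement could be scripted, parsing fio's JSON output. The device path, job parameters, and the exact JSON field layout are assumptions; point it only at a scratch device.

```python
# Hypothetical measurement harness in the spirit of the paper's methodology,
# not the authors' code. Assumes fio is installed and that its JSON output
# exposes jobs[0].read.iops and jobs[0].read.bw (KiB/s).

import json
import subprocess

def measure(dev="/dev/nvme0n1", bs="4k", depth=32, seconds=10):
    cmd = ["fio", "--name=probe", f"--filename={dev}", "--rw=randread",
           f"--bs={bs}", f"--iodepth={depth}", "--ioengine=libaio",
           "--direct=1", f"--runtime={seconds}", "--time_based",
           "--output-format=json"]
    out = json.loads(subprocess.run(cmd, capture_output=True, text=True).stdout)
    rd = out["jobs"][0]["read"]
    return rd["iops"], rd["bw"] / 1024    # IOPS and MiB/s

iops, mibps = measure()
print(f"{iops:.0f} IOPS, {mibps:.1f} MiB/s")
```

Running the same script on the RISC-V SoC and on an x86 server, as the paper does, isolates the host platform as the variable while the device stays fixed.
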
Balancing Latency and Quality in Web Search
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605375
Liang Zhou, K. Ramakrishnan
Abstract: Selecting the right time budget for a search query is challenging because a proper balance among search latency, quality, and efficiency must be maintained. State-of-the-art approaches leverage a centralized sample index at the aggregator to select the Index Serving Nodes (ISNs) that maintain quality and responsiveness. In this paper, we propose Cottage, a coordinated framework between the aggregator and the ISNs for latency and quality optimization in web search. Cottage runs two separate neural network models at each ISN to predict its quality contribution and latency, respectively. These predictions are sent back to the aggregator, where the key task is integrating them to determine an optimal dynamic time budget and to identify slow and low-quality ISNs, improving latency and search efficiency. Our experiments on the Solr search engine show that Cottage can reduce the average query latency by 54% while achieving a good P@10 search quality of 0.947.
Citations: 0
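
Cottage's aggregator turns per-ISN latency and quality predictions into a dynamic time budget. The sketch below replaces the neural predictors with given numbers and shows one plausible budget rule: wait for the fastest ISNs whose cumulative predicted quality meets a target, which naturally drops slow, low-contribution ISNs. The rule and the numbers are illustrative, not Cottage's actual algorithm.

```python
# Hypothetical aggregator-side budget rule, with the paper's neural
# predictions replaced by fixed (latency, quality-gain) pairs per ISN.

def time_budget(predictions, quality_target):
    """predictions: list of (predicted_latency_ms, predicted_quality_gain),
    one per ISN. Returns (budget_ms, indices of ISNs worth waiting for)."""
    ranked = sorted(enumerate(predictions), key=lambda kv: kv[1][0])  # fastest first
    got, chosen = 0.0, []
    for isn, (lat, gain) in ranked:
        chosen.append(isn)
        got += gain
        if got >= quality_target:
            return lat, chosen        # budget = latency of slowest chosen ISN
    return max(l for l, _ in predictions), chosen   # need every ISN

preds = [(35, 0.4), (120, 0.1), (50, 0.3), (300, 0.05), (60, 0.25)]
print(time_budget(preds, quality_target=0.9))  # -> (60, [0, 2, 4]): the 300 ms ISN is skipped
```
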
Flow Scheduling in a Heterogeneous NFV Environment using Reinforcement Learning
2021 IEEE International Conference on Networking, Architecture and Storage (NAS) | Pub Date: 2021-10-01 | DOI: 10.1109/nas51552.2021.9605395
Chun Jen Lin, Yan Luo, Liang-Min Wang, Li-De Chen
Abstract: Network function virtualization (NFV) allows network functions to be executed on general-purpose servers or virtual machines (VMs) instead of proprietary hardware, greatly improving the flexibility and scalability of network services. Recent trends toward using programmable accelerators to speed up NFV performance introduce challenges for flow scheduling in a dynamic NFV environment. Reinforcement learning (RL) trains machine learning models to make decisions that maximize returns in uncertain environments such as NFV. In this paper, we study the allocation of heterogeneous processors (CPUs and FPGAs) to minimize the delays of flows in the system. We conduct extensive simulations to evaluate the performance of RL-based scheduling algorithms such as Advantage Actor Critic (A2C), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO), and compare them with greedy policies. The results show that RL-based schedulers can effectively learn from past experience and converge to the optimal greedy policy. We also analyze in depth how the policies lead to different processor utilization and flow processing times, and provide insights into these policies.
Citations: 0
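
The paper evaluates A2C, TRPO, and PPO on a richer simulator; as a self-contained stand-in, the toy below uses tabular Q-learning to assign arriving flows to a CPU or an FPGA queue, rewarded by negative delay. The service times, the backlog-only state, and all hyperparameters are invented for the example.

```python
# Hypothetical toy version of the scheduling problem, not the paper's setup:
# tabular Q-learning over two processor queues with made-up service times.

import random

SERVICE = {"cpu": 4.0, "fpga": 1.0}   # time units per flow on each processor
Q = {}                                # (state, action) -> estimated value

def state(queues):
    return tuple(min(q, 5) for q in queues.values())   # discretized backlog

def choose(s, eps=0.1):
    if random.random() < eps:
        return random.choice(list(SERVICE))             # explore
    return max(SERVICE, key=lambda a: Q.get((s, a), 0.0))  # exploit

def train(episodes=2000, alpha=0.1, gamma=0.9):
    for _ in range(episodes):
        queues = {"cpu": 0, "fpga": 0}
        for _ in range(20):                    # a burst of 20 arriving flows
            s = state(queues)
            a = choose(s)
            queues[a] += 1
            reward = -queues[a] * SERVICE[a]   # negative completion delay
            s2 = state(queues)
            best_next = max(Q.get((s2, b), 0.0) for b in SERVICE)
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)

random.seed(0)
train()
print(choose(state({"cpu": 0, "fpga": 0}), eps=0.0))  # empty queues: the fast FPGA should win
```

Even this toy reproduces the qualitative finding: the learned policy converges to the same choice a delay-greedy heuristic would make, preferring the accelerator until its backlog erodes the advantage.
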