Operating Systems Review (ACM): Latest Articles

Using Local Cache Coherence for Disaggregated Memory Systems

Operating Systems Review (ACM) Pub Date: 2023-06-26 DOI: 10.1145/3606557.3606561

I. Calciu, M. Imran, Ivan Puddu, Sanidhya Kashyap, H. Maruf, O. Mutlu, Aasheesh Kolli

Abstract: Disaggregated memory provides many cost savings and resource provisioning benefits for current datacenters, but software systems enabling disaggregated memory access result in high performance penalties. These systems require intrusive code changes to port applications for disaggregated memory or employ slow virtual memory mechanisms to avoid code changes. Such mechanisms result in high overhead page faults to access remote data and high dirty data amplification when tracking changes to cached data at page-granularity. In this paper, we propose a fundamentally new approach for disaggregated memory systems, based on the observation that we can use local cache coherence to track applications' memory accesses transparently, without code changes, at cache-line granularity. This simple idea (1) eliminates page faults from the application critical path when accessing remote data, and (2) decouples the application memory access tracking from the virtual memory page size, enabling cache-line granularity dirty data tracking and eviction. Using this observation, we implemented a new software runtime for disaggregated memory that improves average memory access time and reduces dirty data amplification.

Operating Systems Review (ACM), Volume 57, Issue 1, pp. 21-28. URL: https://doi.org/10.1145/3606557.3606561
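The dirty data amplification that this abstract attributes to page-granularity tracking can be illustrated with a small arithmetic sketch. This is not the paper's runtime: the 4 KiB page size, the 64-byte cache-line size, the helper names, and the workload are all assumptions chosen for illustration.

```python
PAGE_SIZE = 4096       # bytes; assumed 4 KiB virtual memory page
CACHE_LINE_SIZE = 64   # bytes; assumed cache-line size


def bytes_written_back(dirty_addresses, granularity):
    """Bytes evicted when dirtiness is tracked in blocks of `granularity` bytes."""
    # Any block containing at least one modified byte must be written back whole.
    dirty_blocks = {addr // granularity for addr in dirty_addresses}
    return len(dirty_blocks) * granularity


def amplification(dirty_addresses, granularity):
    """Ratio of bytes written back to bytes the application actually modified."""
    modified = len(set(dirty_addresses))
    return bytes_written_back(dirty_addresses, granularity) / modified


# Hypothetical workload: one 8-byte write at the start of each of 10 pages.
writes = [page * PAGE_SIZE + offset for page in range(10) for offset in range(8)]

page_amp = amplification(writes, PAGE_SIZE)
line_amp = amplification(writes, CACHE_LINE_SIZE)
print(page_amp, line_amp)  # prints "512.0 8.0"
```

Under these assumed sizes, the same 80 modified bytes cost 40960 bytes of write-back with page-granularity tracking and 640 bytes with cache-line-granularity tracking.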

Citations: 0

Make It Real: An End-to-End Implementation of A Physically Disaggregated Data Center

Operating Systems Review (ACM) Pub Date: 2023-06-26 DOI: 10.1145/3606557.3606559

Yiying Zhang

Citations: 0

Memory disaggregation: why now and what are the challenges

Operating Systems Review (ACM) Pub Date: 2023-06-26 DOI: 10.1145/3606557.3606563

M. Aguilera, Emmanuel Amaro, Nadav Amit, Erika Hunhoff, Anil Yelam, Gerd Zellweger

Citations: 1

Navigating Performance-Efficiency Tradeoffs in Serverless Computing: Deduplication to the Rescue!

Operating Systems Review (ACM) Pub Date: 2023-06-26 DOI: 10.1145/3606557.3606564

Divyanshu Saxena, T. Ji, Arjun Singhvi, Junaid Khalid, Aditya Akella

Citations: 0

Disaggregated GPU Acceleration for Serverless Applications

Operating Systems Review (ACM) Pub Date: 2023-06-26 DOI: 10.1145/3606557.3606560

Henrique Fingler, Zhiting Zhu, Esther Yoon, Zhipeng Jia, E. Witchel, C. Rossbach

Citations: 0

Memory Disaggregation: Advances and Open Challenges

Operating Systems Review (ACM) Pub Date: 2023-05-06 DOI: 10.1145/3606557.3606562

H. Maruf, Mosharaf Chowdhury

Citations: 0

Positional Paper

Operating Systems Review (ACM) Pub Date: 2022-06-14 DOI: 10.1145/3544497.3544500

Y. Shkuro, B. Renard, Ashutosh Kumar Singh

Citations: 0

Data-Aware Compression for HPC using Machine Learning

Operating Systems Review (ACM) Pub Date: 2022-06-14 DOI: 10.1145/3544497.3544508

Julius Plehn, A. Fuchs, Michael Kuhn, Jakob Lüttgau, T. Ludwig

Citations: 0

Analysis and Workload Characterization of the CERN EOS Storage System

Operating Systems Review (ACM) Pub Date: 2022-06-14 DOI: 10.1145/3544497.3544507

Devashish R. Purandare, Daniel Bittman, E. L. Miller

Citations: 0

An Intelligent Framework for Timely, Accurate, and Comprehensive Cloud Incident Detection

Operating Systems Review (ACM) Pub Date: 2022-06-14 DOI: 10.1145/3544497.3544499

Yichen Li, Xu Zhang, Shilin He, Zhuangbin Chen, Yu Kang, Jinyang Liu, Liqun Li, Yingnong Dang, Feng Gao, Zhangwei Xu, S. Rajmohan, Qingwei Lin, Dongmei Zhang, Michael R. Lyu

Abstract: Cloud incidents (service interruptions or performance degradation) dramatically degrade the reliability of large-scale cloud systems, causing customer dissatisfaction and revenue loss. With years of efforts, cloud providers are able to solve most incidents automatically and rapidly. The secret of this ability is intelligent incident detection. Only when incidents are detected timely, accurately, and comprehensively, can they be diagnosed and mitigated at a satisfiable speed. To overcome the limitations of traditional rule-based detection, we carried out years of incident detection research. We developed a comprehensive AIOps (Artificial Intelligence for IT Operations) framework for incident detection containing a set of data-driven methods. This paper shares our recent experience of developing and deploying such an intelligent incident detection system at Microsoft. We first discuss the real-world challenges of incident detection that constitute the pain points of engineers. Then, we summarize our intelligent solutions proposed in recent years to tackle these challenges. Finally, we show the deployment of the incident detection AIOps framework and demonstrate its practical benefits conveyed to Microsoft cloud services with real cases.

Operating Systems Review (ACM), Volume 56, Issue 1, pp. 1-7. URL: https://doi.org/10.1145/3544497.3544499

Citations: 5