Proceedings of the First Workshop on AI for Systems: Latest Papers

Anomaly Detection in Scientific Datasets using Sparse Representation
Proceedings of the First Workshop on AI for Systems Pub Date : 2023-08-10 DOI: 10.1145/3588982.3603610
Aekyeung Moon, Minjun Kim, Jiaxi Chen, S. Son
Abstract: As the size and complexity of high-performance computing (HPC) systems keep growing, scientists' ability to trust the data produced is paramount due to potential data corruption for various reasons, which may stay undetected. While employing machine learning-based anomaly detection techniques could relieve scientists of such concern, it is practically infeasible due to the need for labels for volumes of scientific datasets and the unwanted extra overhead associated. In this paper, we exploit spatial sparsity profiles exhibited in scientific datasets and propose an approach to detect anomalies effectively. Our method first extracts block-level sparse representations of original datasets in the transformed domain. Then it learns from the extracted sparse representations and builds the boundary threshold between normal and abnormal without relying on labeled data. Experiments using real-world scientific datasets show that the proposed approach requires 13% on average (less than 10% in most cases and as low as 0.3%) of the entire dataset to achieve competitive detection accuracy (70.74%-100.0%) as compared to two state-of-the-art unsupervised techniques.
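The two steps named in the abstract (block-level sparse representation in a transformed domain, then a label-free boundary threshold) can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the DCT as the transform, the 99% energy cutoff, the block size of 64, and the median-plus-MAD threshold, along with the function names `dct_matrix`, `block_sparsity`, and `flag_anomalies`.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (assumed transform)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_sparsity(data, block=64, keep=0.99):
    """Per block, count the coefficients needed to hold `keep` of the energy."""
    data = np.asarray(data, dtype=float)
    usable = len(data) // block * block          # drop a trailing partial block
    blocks = data[:usable].reshape(-1, block)
    coeffs = blocks @ dct_matrix(block).T        # transform every block at once
    energy = np.sort(coeffs ** 2, axis=1)[:, ::-1]
    cum = np.cumsum(energy, axis=1)
    total = cum[:, -1:]
    # All-zero blocks get a fraction of 1.0 everywhere, hence a count of 1.
    frac = np.divide(cum, total, out=np.ones_like(cum), where=total > 0)
    return (frac < keep).sum(axis=1) + 1

def flag_anomalies(counts, k=5.0):
    """Label-free threshold (assumed rule): median + k * MAD of the counts."""
    med = np.median(counts)
    mad = np.median(np.abs(counts - med))
    return counts > med + k * max(mad, 1.0)

# Usage: a smooth signal with one block corrupted by noise.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20 * np.pi, 64 * 32))
signal[640:704] += rng.normal(0.0, 1.0, 64)      # corrupt block index 10
print(np.flatnonzero(flag_anomalies(block_sparsity(signal))))
```

The idea behind the sketch is that smooth scientific data compresses into a handful of transform coefficients per block, so a block that suddenly needs many more coefficients stands out without any labels.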
Citations: 0
Towards Practical Machine Learning Frameworks for Performance Diagnostics in Supercomputers
Proceedings of the First Workshop on AI for Systems Pub Date : 2023-08-10 DOI: 10.1145/3588982.3603609
Burak Aksar, Efe Sencan, B. Schwaller, V. Leung, Jim Brandt, B. Kulis, Manuel Egele, A. Coskun
Abstract: Supercomputers are highly sophisticated computing systems designed to handle complex and computationally intensive tasks. Despite their tremendous efficiency, performance problems still arise due to various factors, such as load imbalance, network congestion, and software-related issues. Monitoring frameworks are commonly used to collect telemetry data, which helps identify potential issues before they become critical or debug problems. However, telemetry analytics is essentially a big data problem that is becoming increasingly difficult to manage due to terabytes of telemetry data collected daily. Owing to the limitations of manual analysis, recent analytics frameworks leverage automated machine learning (ML)-based frameworks to identify patterns and anomalies in this data, enabling system administrators and users to take appropriate action towards resolving performance problems quickly. This paper explores the benefits and challenges of ML-based frameworks that automate performance diagnostics, particularly focusing on labeled training data requirements and deployment challenges. We argue that ML-based frameworks can achieve desirable performance diagnosis results while reducing the need for large labeled data sets, and we demonstrate successful prototypes that are suitable for rapid deployment on real-world systems.
Citations: 0
Streaming Machine Learning for Supporting Data Prefetching in Modern Data Storage Systems
Proceedings of the First Workshop on AI for Systems Pub Date : 2023-08-10 DOI: 10.1145/3588982.3603608
Edson Ramiro Lucas Filho, Lun Yang, Kebo Fu, H. Herodotou
Abstract: Modern data storage systems optimize data access by distributing data across multiple storage tiers and caches, based on numerous tiering and caching policies. The policies' decisions, and in particular the ones related to data prefetching, can severely impact the performance of the entire storage system. In recent years, various machine learning algorithms have been employed to model access patterns in complex data storage workloads. Even though data storage systems handle a constantly changing stream of file requests, current approaches continue to train their models offline in a batch-based approach. In this paper, we investigate the use of streaming machine learning to support data prefetching decisions in data storage systems as it introduces various advantages such as high training efficiency, high prediction accuracy, and high adaptability to changing workload patterns. After extracting a representative set of features in an online fashion, streaming machine learning models can be trained and tested while the system is running. To validate our methodology, we present one streaming classification model to predict the next file offset to be read in a file. We assess the model's performance using production traces provided by Huawei Technologies and demonstrate that streaming machine learning is a feasible approach with low memory consumption and minimal training delay, facilitating accurate predictions in real-time.
Citations: 0
Proceedings of the First Workshop on AI for Systems
Proceedings of the First Workshop on AI for Systems Pub Date : 1900-01-01 DOI: 10.1145/3588982
Citations: 0