Seok-Kyung Kwon, H. Bahn
2022 IEEE/ACIS 23rd International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
Published: 2022-12-07 | DOI: 10.1109/SNPD54884.2022.10051800
Classification and Characterization of Memory Reference Behavior in Machine Learning Workloads
With the recent penetration of artificial intelligence (AI) technologies into many areas of computing, machine learning is being incorporated into modern software design. As the in-memory data of AI workloads grows rapidly, it is important to characterize memory reference behaviors in machine learning workloads. In this paper, we perform a characterization study of memory references in machine learning workloads as the learning types (i.e., supervised vs. unsupervised) and the problem domains (i.e., classification, regression, and clustering) are varied. From this study, we uncover the following five characteristics. First, machine learning workloads exhibit significantly different memory reference patterns from traditional workloads, but they are similar to one another regardless of learning type and problem domain. Second, in all workloads, memory reads and writes continue to appear across a wide range of memory addresses, but there is a specific time period during which only reads appear. Third, among references to memory areas (i.e., code, data, heap, stack, library), the library area accounts for about 90% of total memory references. Fourth, there is a low popularity bias across the memory pages referenced in machine learning workloads, especially for writes. Fifth, when estimating the likelihood of re-referencing, temporal locality is dominant among the top 100 memory pages, but access frequency provides better information beyond that rank. It is expected that the characterization of memory references conducted in this paper will be helpful in the design of memory management policies for machine learning workloads.
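The fifth finding contrasts two ways of predicting which pages will be re-referenced: recency (temporal locality, as exploited by LRU) and access frequency (as exploited by LFU). As a rough illustration of what such an analysis computes, the following sketch ranks pages of a synthetic access trace both ways; the trace and function names are hypothetical, not from the paper.

```python
from collections import Counter

def rank_pages(trace):
    """Rank pages two ways: by recency (temporal locality)
    and by total access frequency."""
    last_access = {}   # page -> index of its most recent reference
    freq = Counter()   # page -> total number of references
    for i, page in enumerate(trace):
        last_access[page] = i
        freq[page] += 1
    # Most recently referenced page first (LRU-style ordering).
    by_recency = sorted(last_access, key=last_access.get, reverse=True)
    # Most frequently referenced page first (LFU-style ordering).
    by_frequency = [p for p, _ in freq.most_common()]
    return by_recency, by_frequency

# Hypothetical trace: page 7 is hot (frequent), page 9 was touched last.
trace = [7, 1, 7, 2, 7, 3, 7, 9]
recency, frequency = rank_pages(trace)
print(recency[0])    # 9 -- most recently referenced
print(frequency[0])  # 7 -- most frequently referenced
```

The two orderings disagree on the top page, which is exactly the tension the paper's fifth observation addresses: recency is the better predictor for the hottest pages, while frequency becomes more informative further down the ranking.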