Seok-Kyung Kwon, H. Bahn
2022 IEEE/ACIS 23rd International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
Published: 2022-12-07 | DOI: 10.1109/SNPD54884.2022.10051800
Classification and Characterization of Memory Reference Behavior in Machine Learning Workloads
With the recent penetration of artificial intelligence (AI) technologies into many areas of computing, machine learning is being incorporated into modern software design. As the in-memory data of AI workloads grows rapidly, it is important to characterize memory reference behaviors in machine learning workloads. In this paper, we perform a characterization study of memory references in machine learning workloads as the learning types (i.e., supervised vs. unsupervised) and the problem domains (i.e., classification, regression, and clustering) are varied. From this study, we uncover the following five characteristics. First, machine learning workloads exhibit significantly different memory reference patterns from traditional workloads, but they are similar to one another regardless of learning type and problem domain. Second, in all workloads, memory reads and writes continue to appear across a wide range of memory addresses, but there is a specific time period during which only reads appear. Third, among references to memory areas (i.e., code, data, heap, stack, library), the library area accounts for about 90% of total memory references. Fourth, there is a low popularity bias across the memory pages referenced in machine learning workloads, especially for writes. Fifth, when estimating the likelihood of re-referencing, temporal locality is dominant among the top 100 memory pages, but access frequency provides better information beyond that rank. It is expected that the characterization of memory references conducted in this paper will be helpful in the design of memory management policies for machine learning workloads.
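The fifth finding contrasts two ways of predicting which pages will be re-referenced: recency (temporal locality, as exploited by LRU) and access frequency (as exploited by LFU). As a rough illustration of what such an analysis computes, the following sketch ranks pages of a synthetic access trace both ways; the trace and function names are hypothetical, not from the paper.

```python
from collections import Counter

def rank_pages(trace):
    """Rank pages two ways: by recency (temporal locality)
    and by total access frequency."""
    last_access = {}   # page -> index of its most recent reference
    freq = Counter()   # page -> total number of references
    for i, page in enumerate(trace):
        last_access[page] = i
        freq[page] += 1
    # Most recently referenced page first (LRU-style ordering).
    by_recency = sorted(last_access, key=last_access.get, reverse=True)
    # Most frequently referenced page first (LFU-style ordering).
    by_frequency = [p for p, _ in freq.most_common()]
    return by_recency, by_frequency

# Hypothetical trace: page 7 is hot (frequent), page 9 was touched last.
trace = [7, 1, 7, 2, 7, 3, 7, 9]
recency, frequency = rank_pages(trace)
print(recency[0])    # 9 -- most recently referenced
print(frequency[0])  # 7 -- most frequently referenced
```

The two orderings disagree on the top page, which is exactly the tension the paper's fifth observation addresses: recency is the better predictor for the hottest pages, while frequency becomes more informative further down the ranking.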