{"title":"Optimizing memory affinity with a hybrid compiler/OS approach","authors":"M. Diener, E. Cruz, M. Alves, E. Borin, P. Navaux","doi":"10.1145/3075564.3075566","DOIUrl":null,"url":null,"abstract":"Optimizing the memory access behavior is an important challenge to improve the performance and energy consumption of parallel applications on shared memory architectures. Modern systems contain complex memory hierarchies with multiple memory controllers and several levels of caches. In such machines, analyzing the affinity between threads and data to map them to the hardware hierarchy reduces the cost of memory accesses. In this paper, we introduce a hybrid technique to optimize the memory access behavior of parallel applications. It is based on a compiler optimization that inserts code to predict, at runtime, the memory access behavior of the application and an OS mechanism that uses this information to optimize the mapping of threads and data. In contrast to previous work, our proposal uses a proactive technique to improve the future memory access behavior using predictions instead of the past behavior. Our mechanism achieves substantial performance gains for a variety of parallel applications.","PeriodicalId":398898,"journal":{"name":"Proceedings of the Computing Frontiers Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Computing Frontiers Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3075564.3075566","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Optimizing the memory access behavior is an important challenge to improve the performance and energy consumption of parallel applications on shared memory architectures. Modern systems contain complex memory hierarchies with multiple memory controllers and several levels of caches. In such machines, analyzing the affinity between threads and data to map them to the hardware hierarchy reduces the cost of memory accesses. In this paper, we introduce a hybrid technique to optimize the memory access behavior of parallel applications. It is based on a compiler optimization that inserts code to predict, at runtime, the memory access behavior of the application and an OS mechanism that uses this information to optimize the mapping of threads and data. In contrast to previous work, our proposal uses a proactive technique to improve the future memory access behavior using predictions instead of the past behavior. Our mechanism achieves substantial performance gains for a variety of parallel applications.