Jiwoo Bang, Chungyong Kim, Sunggon Kim, Qichen Chen, Cheongjun Lee, Eun-Kyu Byun, J. Lee, Hyeonsang Eom
Finer-LRU: A Scalable Page Management Scheme for HPC Manycore Architectures
2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), published 2021-05-01
DOI: 10.1109/IPDPS49936.2021.00065 (https://doi.org/10.1109/IPDPS49936.2021.00065)
Citations: 4
Abstract
In HPC systems, the growing need for concurrency has led to packing more cores onto a single chip. However, because multiple processes share the memory space, frequent access to resources in critical sections, where operations must execute atomically, can result in poor performance. In this paper, we focus on reducing lock contention in the memory management system of an HPC manycore architecture. One of the critical sections causing severe lock contention in the I/O path is in the page management system, which uses multiple Least Recently Used (LRU) lists protected by a single lock instance. To solve this problem, we propose the Finer-LRU scheme, which optimizes the page reclamation process by splitting the LRU lists into multiple sub-lists, each with its own lock instance. Our evaluation results show that the Finer-LRU scheme can improve sequential write throughput by 57.03% and reduce latency by 98.94% compared to the baseline Linux kernel version 5.2.8 on the Intel Knights Landing (KNL) architecture.
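
The idea of splitting one lock-protected LRU list into several sub-lists, each with its own lock, can be illustrated with a small user-space sketch. This is not the authors' kernel implementation: the class name `ShardedLRU`, the modulo mapping from page to sub-list, and the round-robin reclaim order are all assumptions made for illustration, since the abstract does not specify them.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """Toy LRU split into sub-lists, each guarded by its own lock (hypothetical)."""

    def __init__(self, num_sublists=4):
        # One (lock, list) pair per sub-list instead of a single global lock.
        self._locks = [threading.Lock() for _ in range(num_sublists)]
        self._lists = [OrderedDict() for _ in range(num_sublists)]

    def _index(self, page_id):
        # Assumed mapping: the source does not say how pages are assigned.
        return page_id % len(self._lists)

    def touch(self, page_id):
        """Mark a page as most recently used within its own sub-list."""
        i = self._index(page_id)
        with self._locks[i]:  # only this sub-list is locked
            self._lists[i][page_id] = True
            self._lists[i].move_to_end(page_id)

    def reclaim(self, count):
        """Evict up to `count` pages, taking the oldest page of each sub-list in turn."""
        evicted = []
        while len(evicted) < count:
            progress = False
            for lock, lst in zip(self._locks, self._lists):
                if len(evicted) >= count:
                    break
                with lock:
                    if lst:
                        page_id, _ = lst.popitem(last=False)  # oldest entry
                        evicted.append(page_id)
                        progress = True
            if not progress:  # every sub-list is empty
                break
        return evicted

    def __len__(self):
        total = 0
        for lock, lst in zip(self._locks, self._lists):
            with lock:
                total += len(lst)
        return total
```

In this sketch, threads touching pages that map to different sub-lists never wait on the same lock, which is the source of the reduced contention. The trade-off is that recency order is exact only within a sub-list, so eviction approximates a global LRU order rather than reproducing it.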