Finer-LRU: A Scalable Page Management Scheme for HPC Manycore Architectures

2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2021-05-01. DOI: 10.1109/IPDPS49936.2021.00065

Jiwoo Bang, Chungyong Kim, Sunggon Kim, Qichen Chen, Cheongjun Lee, Eun-Kyu Byun, J. Lee, Hyeonsang Eom


Citation count: 4

Abstract

In HPC systems, the growing need for concurrency has led to packing more cores onto a single chip. However, because multiple processes share the memory space, frequent access to resources inside critical sections, which must execute atomically, can degrade performance. In this paper, we focus on reducing lock contention in the memory management system of an HPC manycore architecture. One of the critical sections causing severe lock contention in the I/O path is in the page management system, which uses multiple Least Recently Used (LRU) lists protected by a single lock instance. To solve this problem, we propose the Finer-LRU scheme, which optimizes the page reclamation process by splitting the LRU lists into multiple sub-lists, each with its own lock instance. Our evaluation shows that Finer-LRU improves sequential write throughput by 57.03% and reduces latency by 98.94% compared to the baseline Linux kernel version 5.2.8 on the Intel Knights Landing (KNL) architecture.
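The core idea of the abstract, replacing one lock over the LRU lists with one lock per sub-list, can be illustrated with a small user-space sketch. This is not the authors' kernel implementation: the class name `ShardedLRU`, the modulo placement policy, and the per-sub-list `reclaim` call are all assumptions made for illustration, since the abstract does not say how pages are assigned to sub-lists or how reclamation chooses among them.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """Toy LRU list split into sub-lists, each guarded by its own lock.

    Hypothetical illustration only; not the Finer-LRU kernel code.
    """

    def __init__(self, num_sublists=4):
        # One lock per sub-list instead of a single lock for everything.
        self._locks = [threading.Lock() for _ in range(num_sublists)]
        self._lists = [OrderedDict() for _ in range(num_sublists)]

    def _index(self, page_id):
        # Assumed placement policy: simple modulo on the page id.
        return page_id % len(self._lists)

    def touch(self, page_id):
        """Insert a page, or mark an existing page as most recently used."""
        i = self._index(page_id)
        with self._locks[i]:  # only this sub-list is serialized
            self._lists[i][page_id] = True
            self._lists[i].move_to_end(page_id)

    def reclaim(self, sublist_index):
        """Evict the least recently used page of one sub-list, if any."""
        with self._locks[sublist_index]:
            sublist = self._lists[sublist_index]
            if not sublist:
                return None
            page_id, _ = sublist.popitem(last=False)
            return page_id
```

With this layout, threads that touch pages mapped to different sub-lists never wait on the same lock, which is the contention reduction the abstract describes. The trade-off in this sketch is that recency is only tracked within a sub-list, so eviction order is approximate with respect to a single global LRU list.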

