{"title":"面向高性能固态硬盘的机器学习驱动缓存管理方案","authors":"Hui Sun;Chen Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin","doi":"10.1109/TC.2024.3404064","DOIUrl":null,"url":null,"abstract":"NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has been elevated to tackle complex algorithms. Inside an SSD, a DRAM cache for frequently accessed requests reduces response time and write amplification (WA), thereby improving SSD performance and lifetime. Existing caching schemes, based on temporal locality, overlook its variations, which potentially reduces cache hit rates. Some caching schemes bolster performance via flash-aware techniques but at the expense of the cache hit rate. To address these issues, we propose a random forest machine learning \n<bold>C</b>\nlassifier-empowered \n<bold>C</b>\nache scheme named CCache, where I/O requests are classified into critical, intermediate, and non-critical ones according to their access status. After designing a machine learning model to predict these three types of requests, we implement a trie-level linked list to manage the cache placement and replacement. CCache safeguards critical requests for cache service to the greatest extent, while granting the highest priority to evicting request accessed by non-critical requests. CCache – considering chip state when processing non-critical requests – is implemented in an SSD simulator (SSDSim). CCache outperforms the alternative caching schemes, including LRU, CFLRU, LCR, NCache, ML_WP, and CCache_ANN, in terms of response time, WA, erase count, and hit ratio. The performance discrepancy between CCache and the OPT scheme is marginal. For example, CCache reduces the response time of the competitors by up to 41.9% with an average of 16.1%. CCache slashes erase counts by a maximum of 67.4%, with an average of 21.3%. The performance gap between CCache and and OPT is merely 2.0%-3.0%.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"73 8","pages":"2066-2080"},"PeriodicalIF":3.6000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Machine Learning-Empowered Cache Management Scheme for High-Performance SSDs\",\"authors\":\"Hui Sun;Chen Sun;Haoqiang Tong;Yinliang Yue;Xiao Qin\",\"doi\":\"10.1109/TC.2024.3404064\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has been elevated to tackle complex algorithms. Inside an SSD, a DRAM cache for frequently accessed requests reduces response time and write amplification (WA), thereby improving SSD performance and lifetime. Existing caching schemes, based on temporal locality, overlook its variations, which potentially reduces cache hit rates. Some caching schemes bolster performance via flash-aware techniques but at the expense of the cache hit rate. To address these issues, we propose a random forest machine learning \\n<bold>C</b>\\nlassifier-empowered \\n<bold>C</b>\\nache scheme named CCache, where I/O requests are classified into critical, intermediate, and non-critical ones according to their access status. 
After designing a machine learning model to predict these three types of requests, we implement a trie-level linked list to manage the cache placement and replacement. CCache safeguards critical requests for cache service to the greatest extent, while granting the highest priority to evicting request accessed by non-critical requests. CCache – considering chip state when processing non-critical requests – is implemented in an SSD simulator (SSDSim). CCache outperforms the alternative caching schemes, including LRU, CFLRU, LCR, NCache, ML_WP, and CCache_ANN, in terms of response time, WA, erase count, and hit ratio. The performance discrepancy between CCache and the OPT scheme is marginal. For example, CCache reduces the response time of the competitors by up to 41.9% with an average of 16.1%. CCache slashes erase counts by a maximum of 67.4%, with an average of 21.3%. The performance gap between CCache and and OPT is merely 2.0%-3.0%.\",\"PeriodicalId\":13087,\"journal\":{\"name\":\"IEEE Transactions on Computers\",\"volume\":\"73 8\",\"pages\":\"2066-2080\"},\"PeriodicalIF\":3.6000,\"publicationDate\":\"2024-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computers\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10536870/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10536870/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
A Machine Learning-Empowered Cache Management Scheme for High-Performance SSDs
NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has also been elevated to tackle complex algorithms. Inside an SSD, a DRAM cache for frequently accessed requests reduces response time and write amplification (WA), thereby improving SSD performance and lifetime. Existing caching schemes based on temporal locality overlook variations in that locality, which can reduce cache hit rates. Other caching schemes bolster performance through flash-aware techniques, but at the expense of the cache hit rate. To address these issues, we propose a random-forest Classifier-empowered Cache scheme named CCache, in which I/O requests are classified into critical, intermediate, and non-critical ones according to their access status. After designing a machine learning model to predict these three types of requests, we implement a tri-level linked list to manage cache placement and replacement. CCache keeps data accessed by critical requests in the cache as long as possible, while giving the highest eviction priority to data accessed by non-critical requests. CCache, which also considers flash-chip state when processing non-critical requests, is implemented in an SSD simulator (SSDSim). CCache outperforms alternative caching schemes, including LRU, CFLRU, LCR, NCache, ML_WP, and CCache_ANN, in terms of response time, WA, erase count, and hit ratio, and its performance discrepancy from the OPT scheme is marginal. For example, CCache reduces the response time of the competitors by up to 41.9%, with an average of 16.1%, and slashes erase counts by up to 67.4%, with an average of 21.3%. The performance gap between CCache and OPT is merely 2.0%-3.0%.
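To make the described workflow concrete, below is a minimal Python sketch of the idea in the abstract: a random forest classifier labels each I/O request as critical, intermediate, or non-critical, and a three-level set of LRU queues keeps critical data in the cache longest while evicting non-critical data first. The feature set, training labels, and the simplified eviction rule are illustrative assumptions; this is not the authors' SSDSim implementation, which additionally considers flash-chip state when handling non-critical requests.

# Hypothetical sketch of a classifier-empowered, tri-level cache (assumed details).
from collections import OrderedDict
from sklearn.ensemble import RandomForestClassifier

CRITICAL, INTERMEDIATE, NON_CRITICAL = 0, 1, 2

class TriLevelCache:
    """Cache split into three LRU queues, one per predicted request class."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.levels = {c: OrderedDict() for c in (CRITICAL, INTERMEDIATE, NON_CRITICAL)}

    def _size(self):
        return sum(len(q) for q in self.levels.values())

    def access(self, lpn, klass):
        # Hit: refresh recency inside whichever level currently holds the page.
        for q in self.levels.values():
            if lpn in q:
                q.move_to_end(lpn)
                return True
        # Miss: make room if needed, then insert into the predicted level.
        while self._size() >= self.capacity:
            self._evict_one()
        self.levels[klass][lpn] = True
        return False

    def _evict_one(self):
        # Evict from the lowest-priority non-empty level first:
        # non-critical, then intermediate, then critical.
        for c in (NON_CRITICAL, INTERMEDIATE, CRITICAL):
            if self.levels[c]:
                self.levels[c].popitem(last=False)
                return

# Illustrative per-request features (logical page number, request size,
# reuse distance) with labels derived offline from observed access behavior.
X_train = [[100, 8, 2], [5000, 64, 900], [42, 4, 1], [7000, 128, 5000]]
y_train = [CRITICAL, NON_CRITICAL, CRITICAL, NON_CRITICAL]
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_train, y_train)

cache = TriLevelCache(capacity=2)
for features in X_train:
    klass = int(clf.predict([features])[0])
    hit = cache.access(lpn=features[0], klass=klass)
    print(features[0], "hit" if hit else "miss")

Separating the cache into one queue per predicted class is one straightforward way to realize the priority described in the abstract: data of non-critical requests is always the first eviction candidate, so cache space is preserved for critical requests.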
Journal Introduction:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.