Cop-Flash: Utilizing hybrid storage to construct a large, efficient, and durable computational storage for DNN training
Authors: Chunhua Xiao, S. Qiu, Dandan Xu
DOI: 10.1109/CLOUD55607.2022.00041 (https://doi.org/10.1109/CLOUD55607.2022.00041)
Journal: IEEE Cloud Computing, volume 9 1, pages 209-218
Publication date: 2022-07-01
Traditional computing architectures that separate computing from storage face severe limitations when processing the data that is continuously produced in the cloud and at the edge. The computational storage device (CSD) has recently become a critical piece of cloud infrastructure that can overcome these limitations. Many studies use CSDs for DNN training to extract useful information and knowledge from the data quickly and efficiently. However, all previous work has used homogeneous storage, which does not fully account for the requirements of DNN training on a CSD. We therefore leverage hybrid NAND flash memory to address this problem. Typical hybrid storage architectures, however, have limitations when used for DNN training, and their management strategies cannot fully exploit the heterogeneity of hybrid flash memory. To address these issues, we propose a novel SLC-TLC flash memory called Co-Partitioning Flash (Cop-Flash), which combines two different hybrid flash memory partitioning methods to divide storage into three flash regions with different properties. Cop-Flash includes two key techniques: 1) a lifetime-based I/O identifier, which identifies data hotness according to data lifetime in order to maximize the benefits of heterogeneity and minimize the impact of garbage collection; and 2) Erase-aware Adaptive Dual-zone Management, which increases bandwidth utilization and guarantees system reliability. We compared Cop-Flash with two related state-of-the-art hybrid storage designs, one using hard partitioning and one using soft partitioning, as well as with TLC-only flash memory, under real DNN training workloads. Experimental results show that Cop-Flash improves performance over these three baselines by 29.1%, 38.8%, and 56.6%, and extends lifespan by 2.3x, 1.29x, and 8.3x.
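The abstract does not specify how the lifetime-based I/O identifier works, so the following is only a minimal illustrative sketch of the general idea of classifying hotness by data lifetime. It is not the authors' algorithm. Everything here is an assumption made for illustration: the class name `LifetimeClassifier`, the threshold and smoothing parameters, the use of an exponential moving average, and the simplification to two target regions (`"SLC"` and `"TLC"`) instead of the three regions the abstract mentions. Lifetime is taken to mean the interval between two successive writes to the same logical page.

```python
from dataclasses import dataclass, field


@dataclass
class LifetimeClassifier:
    """Toy hotness classifier (hypothetical, not from the paper).

    A logical page counts as hot when its data tends to be
    overwritten soon after it was written (short lifetime).
    """
    hot_threshold: float = 100.0  # assumed cutoff, in logical time ticks
    alpha: float = 0.5            # assumed smoothing weight for the average
    last_write: dict = field(default_factory=dict)
    avg_lifetime: dict = field(default_factory=dict)

    def on_write(self, lpn: int, now: float) -> str:
        """Record a write to logical page `lpn` at time `now`.

        Returns the region this write would be routed to.
        """
        if lpn in self.last_write:
            # The previous copy lived from its write time until now.
            lifetime = now - self.last_write[lpn]
            prev = self.avg_lifetime.get(lpn, lifetime)
            self.avg_lifetime[lpn] = (
                self.alpha * lifetime + (1 - self.alpha) * prev
            )
        self.last_write[lpn] = now

        estimate = self.avg_lifetime.get(lpn)
        if estimate is None:
            return "TLC"  # no history yet: treated as cold (assumption)
        # Short-lived data goes to the fast, durable region.
        return "SLC" if estimate < self.hot_threshold else "TLC"


clf = LifetimeClassifier()
first = clf.on_write(7, 0)      # no history for page 7 yet
second = clf.on_write(7, 10)    # overwritten after 10 ticks
third = clf.on_write(7, 1000)   # overwritten after 990 ticks
```

In this sketch, routing short-lived data to SLC keeps frequently invalidated pages together, which is one common way to reduce the valid-page copying that garbage collection must do. Whether Cop-Flash uses a similar rule is not stated in the abstract.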
About the journal:
Cessation.
IEEE Cloud Computing is committed to the timely publication of peer-reviewed articles that provide innovative research ideas, applications results, and case studies in all areas of cloud computing. Topics relating to novel theory, algorithms, performance analyses and applications of techniques are covered. More specifically: Cloud software, Cloud security, Trade-offs between privacy and utility of cloud, Cloud in the business environment, Cloud economics, Cloud governance, Migrating to the cloud, Cloud standards, Development tools, Backup and recovery, Interoperability, Applications management, Data analytics, Communications protocols, Mobile cloud, Private clouds, Liability issues for data loss on clouds, Data integration, Big data, Cloud education, Cloud skill sets, Cloud energy consumption, The architecture of cloud computing, Applications in commerce, education, and industry, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), Business Process as a Service (BPaaS)