ProTrain: Efficient LLM Training via Memory-Aware Techniques

Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu
{"title":"ProTrain: Efficient LLM Training via Memory-Aware Techniques","authors":"Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu","doi":"arxiv-2406.08334","DOIUrl":null,"url":null,"abstract":"It is extremely memory-hungry to train Large Language Models (LLM). To solve\nthis problem, existing work exploits the combination of CPU and GPU for the\ntraining process, such as ZeRO-Offload. Such a technique largely democratizes\nbillion-scale model training, making it possible to train with few consumer\ngraphics cards. However, based on our observation, existing frameworks often\nprovide coarse-grained memory management and require experienced experts in\nconfiguration tuning, leading to suboptimal hardware utilization and\nperformance. This paper proposes ProTrain, a novel training system that\nintelligently balances memory usage and performance by coordinating memory,\ncomputation, and IO. ProTrain achieves adaptive memory management through\nChunk-Based Model State Management and Block-Wise Activation Management, guided\nby a Memory-Aware Runtime Profiler without user intervention. ProTrain does not\nchange the training algorithm and thus does not compromise accuracy.\nExperiments show that ProTrain improves training throughput by 1.43$\\times$ to\n2.71$\\times$ compared to the SOTA training systems.","PeriodicalId":501291,"journal":{"name":"arXiv - CS - Performance","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Performance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2406.08334","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Training Large Language Models (LLMs) is extremely memory-hungry. To address this problem, existing work combines CPU and GPU memory during training, as in ZeRO-Offload. Such techniques largely democratize billion-scale model training, making it possible to train with only a few consumer graphics cards. However, based on our observations, existing frameworks often provide coarse-grained memory management and require expert configuration tuning, leading to suboptimal hardware utilization and performance. This paper proposes ProTrain, a novel training system that intelligently balances memory usage and performance by coordinating memory, computation, and IO. ProTrain achieves adaptive memory management through Chunk-Based Model State Management and Block-Wise Activation Management, guided by a Memory-Aware Runtime Profiler, without user intervention. ProTrain does not change the training algorithm and thus does not compromise accuracy. Experiments show that ProTrain improves training throughput by 1.43× to 2.71× compared to state-of-the-art training systems.
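The abstract names Chunk-Based Model State Management only at a high level. As a rough illustration of the general idea, not ProTrain's actual implementation, the sketch below packs parameters into fixed-size flat chunks kept in pinned CPU memory and copies a chunk to the GPU on demand. The PyTorch setting, the chunk size, and the helper names build_chunks and fetch_chunk are assumptions made for this example.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: NOT ProTrain's implementation. It shows, under simple
# assumptions, the core idea of chunk-based model state management: parameters are
# packed into fixed-size chunks that live in pinned CPU memory and are moved to the
# GPU only when needed. The chunk size and packing policy here are hypothetical.

CHUNK_NUMEL = 1 << 20  # hypothetical chunk size: ~1M elements per chunk


def build_chunks(model: nn.Module, chunk_numel: int = CHUNK_NUMEL):
    """Pack parameters into flat, pinned CPU chunks in registration order."""
    groups, current, used = [], [], 0
    for p in model.parameters():
        if used + p.numel() > chunk_numel and current:
            groups.append(current)
            current, used = [], 0
        current.append(p)
        used += p.numel()
    if current:
        groups.append(current)

    chunks = []
    for params in groups:
        # Master copy of the chunk stays on the CPU (pinned for fast async copies).
        flat = torch.cat([p.detach().reshape(-1).cpu() for p in params]).pin_memory()
        chunks.append((flat, params))
    return chunks


def fetch_chunk(flat: torch.Tensor, params, device: str = "cuda"):
    """Copy one chunk to the GPU and scatter its slices back into the parameters."""
    gpu_flat = flat.to(device, non_blocking=True)
    offset = 0
    for p in params:
        n = p.numel()
        p.data = gpu_flat[offset:offset + n].view_as(p)
        offset += n


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024))
    chunks = build_chunks(model)
    if torch.cuda.is_available():
        for flat, params in chunks:
            fetch_chunk(flat, params)  # materialize this chunk's parameters on GPU
```

In a real system such fetches would be overlapped with computation (prefetching the next chunk while the current block runs) and paired with a matching release/offload step for gradients and optimizer states; ProTrain's profiler-guided policy for deciding which chunks stay resident on the GPU is not shown in this sketch.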