{"title":"加速模型训练:性能反模式消除器框架","authors":"R. Singh, Mayank Mishra, Rekha Singhal","doi":"10.1145/3578356.3592596","DOIUrl":null,"url":null,"abstract":"In the realm of ML/DL training pipelines, the training-specific data preparation of complex models may consume up to 87% of the total training time. A data scientist may build training pipelines using Python data structures on GPU while being unaware of the performance antipatterns that arise due to communication between CPU and GPU during model training, etc. These antipatterns may not be easily identifiable using traditional profiling tools alone. In this paper, we propose Performance Antipatterns Eliminator Framework (PAEF), a framework to identify six performance antipatterns occurring due to data movements between CPU and GPU during training. Our framework co-relates profiles of CPU and GPU executions of the pipeline along with the static analysis of the code to identify the performance antipatterns. We further replace these antipatterns with their performant versions. We evaluate the benefits of PAEF for two industrial recommendation models, where we showcase up to 7X speedup by using PAEF over the original pipeline.","PeriodicalId":370204,"journal":{"name":"Proceedings of the 3rd Workshop on Machine Learning and Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Accelerating Model Training: Performance Antipatterns Eliminator Framework\",\"authors\":\"R. Singh, Mayank Mishra, Rekha Singhal\",\"doi\":\"10.1145/3578356.3592596\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the realm of ML/DL training pipelines, the training-specific data preparation of complex models may consume up to 87% of the total training time. A data scientist may build training pipelines using Python data structures on GPU while being unaware of the performance antipatterns that arise due to communication between CPU and GPU during model training, etc. These antipatterns may not be easily identifiable using traditional profiling tools alone. In this paper, we propose Performance Antipatterns Eliminator Framework (PAEF), a framework to identify six performance antipatterns occurring due to data movements between CPU and GPU during training. Our framework co-relates profiles of CPU and GPU executions of the pipeline along with the static analysis of the code to identify the performance antipatterns. We further replace these antipatterns with their performant versions. 
We evaluate the benefits of PAEF for two industrial recommendation models, where we showcase up to 7X speedup by using PAEF over the original pipeline.\",\"PeriodicalId\":370204,\"journal\":{\"name\":\"Proceedings of the 3rd Workshop on Machine Learning and Systems\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 3rd Workshop on Machine Learning and Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3578356.3592596\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 3rd Workshop on Machine Learning and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3578356.3592596","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
In ML/DL training pipelines, the training-specific data preparation of complex models may consume up to 87% of the total training time. A data scientist may build training pipelines using Python data structures on the GPU while being unaware of the performance antipatterns that arise from communication between the CPU and GPU during model training. These antipatterns may not be easily identifiable with traditional profiling tools alone. In this paper, we propose the Performance Antipatterns Eliminator Framework (PAEF), a framework that identifies six performance antipatterns caused by data movement between the CPU and GPU during training. PAEF correlates CPU and GPU execution profiles of the pipeline with a static analysis of the code to identify the antipatterns, and then replaces them with performant versions. We evaluate PAEF on two industrial recommendation models, where it achieves up to a 7X speedup over the original pipelines.
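The abstract does not enumerate the six antipatterns, but a minimal sketch of one representative CPU-GPU data-movement antipattern, and the kind of performant rewrite PAEF-style rewriting would produce, is shown below. The function names, tensor shapes, and loop are illustrative assumptions, not code from the paper.

```python
# Illustrative sketch (assumed, not from PAEF): per-iteration host-to-device
# transfers inside a hot loop versus a single bulk transfer.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
features = [torch.randn(512) for _ in range(1024)]  # CPU-resident rows


def slow_sum(rows):
    # Antipattern: each .to(device) issues a separate small CPU->GPU copy,
    # stalling the GPU once per iteration.
    total = torch.zeros(512, device=device)
    for row in rows:
        total += row.to(device)
    return total


def fast_sum(rows):
    # Performant version: stack on the CPU, transfer once, reduce on the GPU
    # with a single kernel.
    batch = torch.stack(rows)
    return batch.to(device).sum(dim=0)


# Both versions compute the same result, up to floating-point accumulation order.
assert torch.allclose(slow_sum(features).cpu(), fast_sum(features).cpu(), atol=1e-3)
```

Such a per-iteration transfer is invisible to a CPU-only profiler (the loop body looks cheap) and to a GPU-only profiler (each kernel is tiny), which is why correlating both profiles, as the paper describes, is needed to surface it.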