Minhyeok Kweun, Goeun Kim, Byungsoo Oh, Seong-In Jung, Taegeon Um, Woo-Yeon Lee
PokéMem: Taming Wild Memory Consumers in Apache Spark
2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2022
DOI: 10.1109/ipdps53621.2022.00015
Citations: 2
Abstract
Apache Spark is a widely used in-memory processing system due to its high performance. For fast data processing, Spark manages in-memory data, such as cached or shuffle (aggregation and sorting) data, in its own managed memory pools. However, despite this sophisticated memory management scheme, we found that Spark still suffers from out-of-memory (OOM) exceptions and high garbage collection (GC) overheads when wild memory consumers, which are not tracked by Spark and execute external code, use large amounts of memory. To resolve these problems, we propose PokéMem, an enhanced Spark that incorporates wild memory consumers into the managed ones, preventing them from stealthily consuming excessive memory. Our main idea is to open the black box of unmanaged memory regions in external code by providing customized data collections. PokéMem enables fine-grained control of the objects created within running tasks by spilling and reloading the objects of these custom data collections based on memory pressure and access patterns. To further reduce memory pressure, PokéMem exploits pre-built memory estimation models to predict the external code's memory usage and proactively acquires memory before the external code executes, and it also monitors JVM heap usage to avoid critical memory pressure. With the help of these techniques, our evaluations show that PokéMem outperforms vanilla Spark, with up to 3× faster execution and 3.9× lower GC overhead, and successfully runs workloads without the OOM exceptions that cause vanilla Spark to fail.
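To illustrate the core idea of a "custom data collection" that cooperates with a memory manager, here is a minimal, hypothetical Java sketch. It is not PokéMem's or Spark's actual API: `SimpleMemoryManager` and `SpillableBuffer` are illustrative names, and a fixed per-object size estimate stands in for the paper's memory estimation models. Before buffering an object produced by external code, the collection asks the manager for space; if the request is denied, it spills its in-memory contents to disk instead of growing the heap.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Illustrative memory accountant: tracks how many bytes tracked
// collections have acquired against a fixed capacity.
class SimpleMemoryManager {
    private final long capacityBytes;
    private long used = 0;

    SimpleMemoryManager(long capacityBytes) { this.capacityBytes = capacityBytes; }

    synchronized boolean tryAcquire(long bytes) {
        if (used + bytes <= capacityBytes) { used += bytes; return true; }
        return false;  // under memory pressure: caller must spill first
    }

    synchronized void release(long bytes) { used -= bytes; }
}

// A collection that registers every appended object with the memory
// manager and spills to disk when no more memory can be acquired.
class SpillableBuffer<T extends Serializable> {
    private final SimpleMemoryManager mm;
    private final long bytesPerItem;  // coarse per-object size estimate (assumption)
    private final List<T> inMemory = new ArrayList<>();
    int spillCount = 0;

    SpillableBuffer(SimpleMemoryManager mm, long bytesPerItem) {
        this.mm = mm;
        this.bytesPerItem = bytesPerItem;
    }

    void append(T item) throws IOException {
        if (!mm.tryAcquire(bytesPerItem)) {
            spill();  // free tracked memory instead of risking an OOM
            if (!mm.tryAcquire(bytesPerItem)) {
                throw new IllegalStateException("cannot acquire memory after spill");
            }
        }
        inMemory.add(item);
    }

    // Serialize buffered objects to a temp file, then release their
    // tracked memory so other consumers can use it.
    private void spill() throws IOException {
        File f = File.createTempFile("spill", ".bin");
        f.deleteOnExit();
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            for (T item : inMemory) out.writeObject(item);
        }
        mm.release((long) inMemory.size() * bytesPerItem);
        inMemory.clear();
        spillCount++;
    }

    int inMemoryCount() { return inMemory.size(); }
}
```

For example, with a 400-byte capacity and a 100-byte-per-item estimate, appending ten integers forces two spills, keeping the tracked footprint bounded. The real system additionally reloads spilled objects on access and predicts usage before external code runs; this sketch shows only the acquire-or-spill discipline that brings an otherwise unmanaged consumer under the memory manager's control.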