{"title":"Efficient profile-guided size optimization for native mobile applications","authors":"Kyungwoon Lee, Ellis Hoag, N. Tillmann","doi":"10.1145/3497776.3517764","DOIUrl":null,"url":null,"abstract":"Positive user experience of mobile apps demands they not only launch fast and run fluidly, but are also small in order to reduce network bandwidth from regular updates. Conventional optimizations often trade off size regressions for performance wins, making them impractical in the mobile space. Indeed, profile-guided optimization (PGO) is successful in server workloads, but is not effective at reducing size and page faults for mobile apps. Also, profiles must be collected from instrumenting builds that are up to 2X larger, so they cannot run normally on real mobile devices. In this paper, we first introduce Machine IR Profile (MIP), a lightweight instrumentation that runs at the machine IR level. Unlike the existing LLVM IR instrumentation counterpart, MIP withholds static metadata from the instrumenting binaries leading to a 2/3 reduction in size overhead. In addition, MIP collects profile data that is more relevant to optimizations in the mobile space. Then we propose three improvements to the LLVM machine outliner: (i) the global outliner overcomes the local scope of the machine outliner when using ThinLTO, (ii) the frame outliner effectively outlines irregular prologues and epilogues, and (iii) the custom outliner outlines frequent patterns occurring in Objective-C and Swift. Lastly, we present our PGO that orders hot start-up functions to minimize page faults, and controls the size optimization level (-Os vs -Oz) for functions based on their estimated execution time driven from MIP. We also order cold functions based on similarity to minimize the compressed app size. Our work improves both the size and performance of real-world mobile apps when compared to the MinSize (-Oz) optimization level: (i) in SocialApp, we reduced the compressed app size by 5.2%, the uncompressed app size by 9.6% and the page faults by 20.6%, and (ii) in ChatApp, we reduced them by 2.4%, 4.6% and 36.4%, respectively.","PeriodicalId":333281,"journal":{"name":"Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3497776.3517764","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
A positive user experience demands that mobile apps not only launch quickly and run fluidly, but also stay small in order to reduce the network bandwidth consumed by regular updates. Conventional optimizations often trade size regressions for performance wins, making them impractical in the mobile space. Indeed, profile-guided optimization (PGO) is successful on server workloads, but it is not effective at reducing size and page faults for mobile apps. Moreover, profiles must be collected from instrumented builds that are up to 2X larger, which cannot run normally on real mobile devices. In this paper, we first introduce Machine IR Profile (MIP), a lightweight instrumentation that runs at the machine IR level. Unlike its existing LLVM IR instrumentation counterpart, MIP withholds static metadata from the instrumented binaries, leading to a 2/3 reduction in size overhead. In addition, MIP collects profile data that is more relevant to optimizations in the mobile space. We then propose three improvements to the LLVM machine outliner: (i) the global outliner overcomes the local scope of the machine outliner when using ThinLTO, (ii) the frame outliner effectively outlines irregular prologues and epilogues, and (iii) the custom outliner outlines frequent patterns occurring in Objective-C and Swift. Lastly, we present our PGO, which orders hot start-up functions to minimize page faults and controls the size optimization level (-Os vs. -Oz) for each function based on its estimated execution time derived from MIP. We also order cold functions by similarity to minimize the compressed app size. Our work improves both the size and the performance of real-world mobile apps compared to the MinSize (-Oz) optimization level: (i) in SocialApp, we reduced the compressed app size by 5.2%, the uncompressed app size by 9.6%, and the page faults by 20.6%, and (ii) in ChatApp, we reduced them by 2.4%, 4.6%, and 36.4%, respectively.
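To make the per-function size-optimization control and the hot/cold grouping described in the abstract concrete, the sketch below approximates the idea with stock Clang facilities: the hot, cold, and minsize function attributes stand in for the MIP-driven, profile-based decisions the paper makes inside the compiler, so this is an illustrative approximation rather than the authors' implementation.

/* Illustrative sketch only (not the paper's MIP-driven mechanism):
 * approximating per-function size-optimization control and start-up
 * grouping with standard Clang/GCC function attributes. */
#include <stdio.h>

/* Hot during start-up: marked so the compiler does not treat it as cold,
 * allowing a linker ordering file to place it with other start-up code
 * and reduce page faults at launch. */
__attribute__((hot))
static int parse_launch_config(const char *path) {
    printf("loading %s\n", path);
    return 0;
}

/* Rarely executed: request -Oz-like treatment for just this function via
 * the minsize attribute, and mark it cold so it is grouped away from the
 * start-up working set. */
__attribute__((cold, minsize))
static void report_diagnostics(int error_code) {
    fprintf(stderr, "error %d\n", error_code);
}

int main(void) {
    if (parse_launch_config("app.plist") != 0)
        report_diagnostics(1);
    return 0;
}

The resulting hot symbols could then be placed contiguously with a linker ordering file, for example lld's --symbol-ordering-file on ELF targets or ld64's -order_file on Apple platforms, which is the conventional way to reduce start-up page faults; the paper derives such an ordering automatically from MIP profiles rather than from manual annotation.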