Marihan Amein, Zhuoran Xiong, Olivier Therrien, B. Meyer, W. Gross
{"title":"Work-in-Progress: SuperNAS: Fast Multi-Objective SuperNet Architecture Search for Semantic Segmentation","authors":"Marihan Amein, Zhuoran Xiong, Olivier Therrien, B. Meyer, W. Gross","doi":"10.1109/CASES55004.2022.00024","DOIUrl":"https://doi.org/10.1109/CASES55004.2022.00024","url":null,"abstract":"We present SuperNAS, a fast multi-objective neural architecture search framework for semantic segmentation. SuperNAS subsamples the structure and pre-trained parameters of DeepLabV3+, without fine-tuning, dramatically reducing training time during search. To further reduce candidate evaluation time, we use a subset of the validation dataset during search. Only the final Pareto-dominant candidates are ultimately fine-tuned using the complete training set. We evaluate SuperNAS by searching for models that effectively trade accuracy and computational cost on the PASCAL VOC 2012 dataset. SuperNAS finds competitive designs quickly, e.g., taking just 0.5 GPU days to discover a DeepLabV3+ variant that reduces FLOPs and parameters by 10% and 20%, respectively, for less than 3% increased error.","PeriodicalId":331181,"journal":{"name":"2022 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127848048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Work-in-Progress: RISC-V Based Low-cost Embedded Trace Processing System","authors":"Xiao Hu, Yao Wang, Xuan-yi Gao","doi":"10.1109/CASES55004.2022.00022","DOIUrl":"https://doi.org/10.1109/CASES55004.2022.00022","url":null,"abstract":"Although on-chip trace debugging plays a key role in post-silicon debug and software optimization, it must handle massive amounts of trace information with the limited on-chip hardware resources of embedded SoC processors. To this end, this paper proposes a Low-cost Embedded Trace Processing System (LE-TPS). LE-TPS employs a low-cost RISC-V core with customized trace handling instructions to exploit the underutilized resources of existing SoCs. This allows LE-TPS to collect, store, and transmit trace information with low hardware cost, software independence, and minimal performance overhead. We believe that LE-TPS could be effective in post-silicon debug and software optimization.","PeriodicalId":331181,"journal":{"name":"2022 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125956965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amir H. Ashouri, Mostafa Elhoushi, Yu-Wei Hua, Xiang Wang, M. A. Manzoor, Bryan Chan, Yaoqing Gao
{"title":"Work-in-Progress: MLGOPerf: An ML Guided Inliner to Optimize Performance","authors":"Amir H. Ashouri, Mostafa Elhoushi, Yu-Wei Hua, Xiang Wang, M. A. Manzoor, Bryan Chan, Yaoqing Gao","doi":"10.1109/CASES55004.2022.00008","DOIUrl":"https://doi.org/10.1109/CASES55004.2022.00008","url":null,"abstract":"This paper presents MLGOPerf, the first end-to-end framework capable of optimizing performance using LLVM’s ML-Inliner. It employs a secondary ML model to generate rewards used for training a retargeted reinforcement learning agent, previously used as the primary model by MLGO. It does so by predicting the post-inlining speedup of a function under analysis, which enables a fast training framework for the primary model that would otherwise not be practical. The experimental results show that MLGOPerf gains up to 1.8% over LLVM’s optimization at O3 when trained for performance on SPEC CPU2006. Furthermore, the proposed approach provides up to 26% more opportunities to autotune code regions for our benchmarks, which can translate into an additional 3.7% speedup.","PeriodicalId":331181,"journal":{"name":"2022 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115597603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}