Extracting DNN Architectures via Runtime Profiling on Mobile GPUs

IF 3.7 2区工程技术 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Journal on Emerging and Selected Topics in Circuits and Systems Pub Date : 2024-10-30 DOI:10.1109/JETCAS.2024.3488597

Dong Hyub Kim;Jonah O’Brien Weiss;Sandip Kundu

{"title":"Extracting DNN Architectures via Runtime Profiling on Mobile GPUs","authors":"Dong Hyub Kim;Jonah O’Brien Weiss;Sandip Kundu","doi":"10.1109/JETCAS.2024.3488597","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) have become invaluable intellectual property for AI providers due to advancements fueled by a decade of research and development. However, recent studies have demonstrated the effectiveness of model extraction attacks, which threaten this value by stealing DNN models. These attacks can lead to misuse of personal data, safety risks in critical systems, and the spread of misinformation. This paper explores model extraction attacks on DNN models deployed on mobile devices, using runtime profiles as a side-channel. Since mobile devices are resource constrained, DNN deployments require optimization efforts to reduce latency. The main hurdle in extracting DNN architectures in this scenario is that optimization techniques, such as operator-level and graph-level fusion, can obfuscate the association between runtime profile operators and their corresponding DNN layers, posing challenges for adversaries to accurately predict the computation performed. To overcome this, we propose a novel method analyzing GPU call profiles to identify the original DNN architecture. Our approach achieves full accuracy in extracting DNN architectures from a predefined set, even when layer information is obscured. For unseen architectures, a layer-by-layer hyperparameter extraction method guided by sub-layer patterns is introduced, also achieving high accuracy. This research achieves two firsts: 1) targeting mobile GPUs for DNN architecture extraction and 2) successfully extracting architectures from optimized models with fused layers.","PeriodicalId":48827,"journal":{"name":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","volume":"14 4","pages":"620-633"},"PeriodicalIF":3.7000,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10738518/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

Abstract

Deep Neural Networks (DNNs) have become invaluable intellectual property for AI providers due to advancements fueled by a decade of research and development. However, recent studies have demonstrated the effectiveness of model extraction attacks, which threaten this value by stealing DNN models. These attacks can lead to misuse of personal data, safety risks in critical systems, and the spread of misinformation. This paper explores model extraction attacks on DNN models deployed on mobile devices, using runtime profiles as a side-channel. Since mobile devices are resource constrained, DNN deployments require optimization efforts to reduce latency. The main hurdle in extracting DNN architectures in this scenario is that optimization techniques, such as operator-level and graph-level fusion, can obfuscate the association between runtime profile operators and their corresponding DNN layers, posing challenges for adversaries to accurately predict the computation performed. To overcome this, we propose a novel method analyzing GPU call profiles to identify the original DNN architecture. Our approach achieves full accuracy in extracting DNN architectures from a predefined set, even when layer information is obscured. For unseen architectures, a layer-by-layer hyperparameter extraction method guided by sub-layer patterns is introduced, also achieving high accuracy. This research achieves two firsts: 1) targeting mobile GPUs for DNN architecture extraction and 2) successfully extracting architectures from optimized models with fused layers.

查看原文本刊更多论文

在移动 GPU 上通过运行时剖析提取 DNN 架构

经过十年的研发，深度神经网络（DNN）已经成为人工智能供应商的宝贵知识产权。然而，最近的研究证明了模型提取攻击的有效性，这些攻击通过窃取 DNN 模型威胁到了这一价值。这些攻击可能导致个人数据的滥用、关键系统的安全风险以及错误信息的传播。本文利用运行时配置文件作为侧通道，探讨了对部署在移动设备上的 DNN 模型的模型提取攻击。由于移动设备资源有限，DNN 部署需要进行优化以减少延迟。在这种情况下，提取 DNN 架构的主要障碍是运算符级和图级融合等优化技术会混淆运行时配置文件运算符与其相应 DNN 层之间的关联，从而给对手准确预测所执行的计算带来挑战。为了克服这一问题，我们提出了一种新方法，通过分析 GPU 调用配置文件来识别原始 DNN 架构。我们的方法能从预定义的集合中完全准确地提取 DNN 架构，即使层信息被掩盖也不例外。对于不可见的架构，我们引入了一种由子层模式引导的逐层超参数提取方法，同样达到了很高的准确率。这项研究开创了两个先河：1）针对移动 GPU 进行 DNN 架构提取；2）成功地从具有融合层的优化模型中提取架构。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEEE Journal on Emerging and Selected Topics in Circuits and Systems ENGINEERING, ELECTRICAL & ELECTRONIC-

CiteScore

8.50

自引率

2.20%

发文量

期刊介绍： The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.