A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth
{"title":"A Cost-Aware Approach to Adversarial Robustness in Neural Networks","authors":"Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth","doi":"arxiv-2409.07609","DOIUrl":null,"url":null,"abstract":"Considering the growing prominence of production-level AI and the threat of\nadversarial attacks that can evade a model at run-time, evaluating the\nrobustness of models to these evasion attacks is of critical importance.\nAdditionally, testing model changes likely means deploying the models to (e.g.\na car or a medical imaging device), or a drone to see how it affects\nperformance, making un-tested changes a public problem that reduces development\nspeed, increases cost of development, and makes it difficult (if not\nimpossible) to parse cause from effect. In this work, we used survival analysis\nas a cloud-native, time-efficient and precise method for predicting model\nperformance in the presence of adversarial noise. For neural networks in\nparticular, the relationships between the learning rate, batch size, training\ntime, convergence time, and deployment cost are highly complex, so researchers\ngenerally rely on benchmark datasets to assess the ability of a model to\ngeneralize beyond the training data. To address this, we propose using\naccelerated failure time models to measure the effect of hardware choice, batch\nsize, number of epochs, and test-set accuracy by using adversarial attacks to\ninduce failures on a reference model architecture before deploying the model to\nthe real world. We evaluate several GPU types and use the Tree Parzen Estimator\nto maximize model robustness and minimize model run-time simultaneously. This\nprovides a way to evaluate the model and optimise it in a single step, while\nsimultaneously allowing us to model the effect of model parameters on training\ntime, prediction time, and accuracy. Using this technique, we demonstrate that\nnewer, more-powerful hardware does decrease the training time, but with a\nmonetary and power cost that far outpaces the marginal gains in accuracy.","PeriodicalId":501172,"journal":{"name":"arXiv - STAT - Applications","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07609","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Considering the growing prominence of production-level AI and the threat of adversarial attacks that can evade a model at run-time, evaluating the robustness of models to these evasion attacks is of critical importance. Additionally, testing model changes often means deploying the model to an edge device (e.g., a car, a medical imaging device, or a drone) to see how the change affects performance, making untested changes a public problem that slows development, increases development cost, and makes it difficult (if not impossible) to separate cause from effect. In this work, we use survival analysis as a cloud-native, time-efficient, and precise method for predicting model performance in the presence of adversarial noise. For neural networks in particular, the relationships between learning rate, batch size, training time, convergence time, and deployment cost are highly complex, so researchers generally rely on benchmark datasets to assess a model's ability to generalize beyond the training data. To address this, we propose accelerated failure time (AFT) models to measure the effects of hardware choice, batch size, number of epochs, and test-set accuracy, using adversarial attacks to induce failures on a reference model architecture before the model is deployed to the real world. We evaluate several GPU types and use the Tree Parzen Estimator to simultaneously maximize model robustness and minimize model run-time. This provides a way to evaluate and optimise the model in a single step, while also allowing us to model the effect of model parameters on training time, prediction time, and accuracy. Using this technique, we demonstrate that newer, more powerful hardware does decrease training time, but at a monetary and power cost that far outpaces the marginal gains in accuracy.
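To make the accelerated-failure-time idea concrete, below is a minimal sketch using the `lifelines` library. It is not the paper's code: the column names (`batch_size`, `epochs`, `test_accuracy`, `attack_iters`, `failed`) and the synthetic data are illustrative assumptions. The "survival time" here is taken to be the number of attack iterations a model withstands before its prediction is first flipped, with some runs censored (never failed within budget); the AFT model then estimates how each training covariate stretches or compresses that time-to-failure.

```python
# Minimal AFT sketch with lifelines; data and column names are assumptions.
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(0)
n = 200

# Hypothetical per-run records: training configuration as covariates.
df = pd.DataFrame({
    "batch_size":    rng.choice([32, 64, 128, 256], size=n),
    "epochs":        rng.integers(10, 100, size=n),
    "test_accuracy": rng.uniform(0.7, 0.99, size=n),
})
# Synthetic time-to-failure: here, better-trained models survive longer.
df["attack_iters"] = rng.exponential(scale=5 + 50 * df["test_accuracy"])
# Event indicator: 1 = failure observed, 0 = censored (never failed).
df["failed"] = (rng.random(n) < 0.9).astype(int)

# Weibull AFT model: covariates act multiplicatively on time-to-failure.
aft = WeibullAFTFitter()
aft.fit(df, duration_col="attack_iters", event_col="failed")
aft.print_summary()  # coefficients show each covariate's effect on survival
```

A positive coefficient on a covariate means it accelerates survival (the model withstands more attack iterations); this is what lets the approach attribute robustness changes to specific training choices rather than reading them off a single benchmark number.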
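The joint "maximize robustness, minimize run-time" search can likewise be sketched with Optuna's TPE sampler, one common implementation of the Tree Parzen Estimator. Again this is an assumption-laden illustration: `mock_robustness` and `mock_runtime` are toy stand-ins for the expensive steps (training a model, attacking it, timing it), not the paper's pipeline.

```python
# Multi-objective TPE sketch with Optuna; objective internals are mock stand-ins.
import optuna

def mock_robustness(batch_size, epochs, lr):
    # Toy surrogate for "attack iterations survived"; higher is better.
    return epochs / 100 - abs(lr - 1e-2)

def mock_runtime(batch_size, epochs):
    # Toy surrogate for wall-clock cost in seconds; lower is better.
    return epochs * 256 / batch_size

def objective(trial):
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128, 256])
    epochs = trial.suggest_int("epochs", 10, 100)
    lr = trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True)
    return mock_robustness(batch_size, epochs, lr), mock_runtime(batch_size, epochs)

study = optuna.create_study(
    directions=["maximize", "minimize"],        # robustness up, run-time down
    sampler=optuna.samplers.TPESampler(seed=0),
)
study.optimize(objective, n_trials=50)

for t in study.best_trials:                     # Pareto-optimal trade-offs
    print(t.values, t.params)
```

Because the study is multi-objective, the result is a Pareto front of configurations rather than a single optimum, which matches the paper's framing of robustness against monetary and power cost.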