Neural predictor aided policy optimization for adversarial controlled sensing

IF 3.4 · Zone 2 (Engineering & Technology) · JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC

Signal Processing Pub Date : 2025-06-13 DOI:10.1016/j.sigpro.2025.110115

Nicholas Kalouptsidis, George Stamatelis

Signal Processing, Volume 238, Article 110115. https://www.sciencedirect.com/science/article/pii/S0165168425002294

Citations: 0

Abstract

This paper is concerned with the fundamental problem of controlled sensing, namely how to optimize signal processing resources in a sensor network in order to detect the true hidden state of the environment when the sensors are subject to adversarial attacks. The sensing task is performed by a legitimate agent who actively selects observations generated by a set of sensors and infers the true state by minimizing the error probability. The adversary may have access to all or a subset of the sensors and can influence the quality of observations. Agents may have only partial access to the complete data set, leading to different beliefs about the true state and different perceptions of the error probability. To address the complexities of this problem, we define well-motivated approximate structures that fill the gaps left by partial information. We provide three different objective functions for training a neural predictor, and we demonstrate how prediction quality is a precondition for detection performance. Based on these concepts, we propose a novel deep reinforcement learning (DRL) algorithm, termed Predictive Proximal Policy Optimization for Adversarial Controlled Sensing (3POACS). This algorithm combines building blocks from single-agent DRL, problem-specific reward reshaping procedures, and a neural predictor. Finally, we use an anomaly detection example to demonstrate the superiority of the proposed method over previous non-adversarial approaches. Experiments show that the new algorithm competes favorably with DRL algorithms that have access to oracle predictors.
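The abstract does not give the belief recursion or the decision rule in closed form, so the following is only a minimal sketch of the non-adversarial building block it alludes to: an agent selects a sensor, updates its posterior over hypotheses by Bayes' rule, and declares the maximum a posteriori (MAP) hypothesis, which minimizes the error probability under that belief. Everything here is an assumption made for illustration: the two-hypothesis, two-sensor, binary-observation setup, the `LIKELIHOODS` table, and the function names. The adversary, the neural predictor, and the 3POACS policy itself are not modeled.

```python
from typing import List, Sequence

# Hypothetical likelihood table (not from the paper):
# LIKELIHOODS[h][s][y] = P(observation y | hypothesis h, sensor s).
# Sensor 0 is informative; sensor 1 is pure noise under both hypotheses.
LIKELIHOODS = [
    [[0.9, 0.1], [0.5, 0.5]],  # hypothesis 0
    [[0.2, 0.8], [0.5, 0.5]],  # hypothesis 1
]


def belief_update(
    belief: Sequence[float],
    likelihoods: Sequence[Sequence[Sequence[float]]],
    sensor: int,
    observation: int,
) -> List[float]:
    """Return the Bayes posterior over hypotheses after one observation."""
    unnormalized = [
        prior * likelihoods[h][sensor][observation]
        for h, prior in enumerate(belief)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return [u / total for u in unnormalized]


def map_decision(belief: Sequence[float]) -> int:
    """Pick the hypothesis with the largest posterior probability."""
    return max(range(len(belief)), key=lambda h: belief[h])


def error_probability(belief: Sequence[float]) -> float:
    """Error probability of the MAP decision under the agent's own belief."""
    return 1.0 - max(belief)
```

Starting from a uniform prior, calling `belief_update([0.5, 0.5], LIKELIHOODS, sensor=0, observation=1)` shifts the belief toward hypothesis 1, whereas querying the uninformative sensor 1 leaves the belief unchanged. This is why the choice of sensor matters for driving the error probability down.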

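Source journal: Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 9.10%

Articles published: 309

Review time: 41 days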

About the journal: Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing. Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.