Sequential Anomaly Detection using Inverse Reinforcement Learning

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Pub Date: 2019-07-25. DOI: 10.1145/3292500.3330932

Min-hwan Oh, G. Iyengar

Citations: 59

Abstract

One of the most interesting application scenarios in anomaly detection is when sequential data are targeted. For example, in a safety-critical environment, it is crucial to have an automatic detection system to screen the streaming data gathered by monitoring sensors and to report abnormal observations if detected in real-time. Oftentimes, stakes are much higher when these potential anomalies are intentional or goal-oriented. We propose an end-to-end framework for sequential anomaly detection using inverse reinforcement learning (IRL), whose objective is to determine the decision-making agent's underlying function which triggers his/her behavior. The proposed method takes the sequence of actions of a target agent (and possibly other meta information) as input. The agent's normal behavior is then understood by the reward function which is inferred via IRL. We use a neural network to represent a reward function. Using a learned reward function, we evaluate whether a new observation from the target agent follows a normal pattern. In order to construct a reliable anomaly detection method and take into consideration the confidence of the predicted anomaly score, we adopt a Bayesian approach for IRL. The empirical study on publicly available real-world data shows that our proposed method is effective in identifying anomalies.
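The scoring step described in the abstract (evaluate a new observation under a learned reward function, and take the confidence of the score into account) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Bayesian IRL step has already produced a set of posterior samples of the reward network's weights, and it does not perform IRL itself. The names `reward_net`, `trajectory_scores`, `is_anomalous`, the one-hidden-layer architecture, the `threshold`, and the margin `z` are all hypothetical choices made for this example.

```python
import numpy as np


def reward_net(params, features):
    """Tiny one-hidden-layer network: per-step features (T, D) -> rewards (T,).

    Stand-in for a learned neural reward function (architecture is assumed).
    """
    w1, b1, w2, b2 = params
    hidden = np.tanh(features @ w1 + b1)   # shape (T, H)
    return hidden @ w2 + b2                # shape (T,)


def trajectory_scores(posterior_params, features):
    """Mean per-step reward of one trajectory under each posterior sample."""
    return np.array([reward_net(p, features).mean() for p in posterior_params])


def is_anomalous(posterior_params, features, threshold, z=2.0):
    """Flag a trajectory whose reward is below `threshold` by a margin of z std.

    The spread across posterior samples serves as a rough confidence measure:
    a low mean reward with a large spread is not flagged.
    """
    scores = trajectory_scores(posterior_params, features)
    mean, std = float(scores.mean()), float(scores.std())
    return (mean + z * std) < threshold, mean, std


if __name__ == "__main__":
    # Placeholder weights; in practice these would come from Bayesian IRL.
    rng = np.random.default_rng(0)
    d, h, k = 3, 4, 10
    posterior = [
        (rng.normal(size=(d, h)), rng.normal(size=h), rng.normal(size=h), 0.0)
        for _ in range(k)
    ]
    new_trajectory = rng.normal(size=(20, d))  # 20 steps of 3-dim features
    print(is_anomalous(posterior, new_trajectory, threshold=-0.5))
```

The design choice here is to require the whole interval `mean + z * std` to fall below the threshold, so that an observation is reported only when the reward samples agree that it is unlikely under normal behavior.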
