Incremental Learning in Time-series Data using Reinforcement Learning

2022 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2022-11-01 DOI:10.1109/ICDMW58026.2022.00115

Mustafa Shuqair, J. Jimenez-shahed, B. Ghoraani


Citations: 2

Abstract

System monitoring has become an area of growing interest with the spread of wearable sensors and continuous monitoring tools. However, generalizing classification models to unseen incoming data remains challenging. This paper proposes a novel architecture based on reinforcement learning (RL) to incrementally learn patterns of time-series data and detect changes in the system state. Our rationale is that RL's ability to learn from past experiences can help increase the performance and generalizability of classification models in time-series monitoring applications. Our novel definition of the environment consists of a set of one-class anomaly detectors, which define environment states based on the dynamics of the incoming data, and a reward function, which rewards the RL agent according to its actions. A deep RL agent incrementally learns to perform continuous, binary classification predictions according to the environment states and the received reward. We applied the proposed model to detecting response to medication (ON or OFF) in patients with Parkinson's disease (PD). The PD dataset consisted of 170 minutes of time-series movement signals collected from 12 patients using two wearable sensors. Our proposed model, with a testing accuracy of 77.95%, outperformed Adaptive Boosting, Multi-layer Perceptron, and Support Vector Machines, which achieved 53.10%, 44.92%, and 52.70% testing accuracy, respectively. The proposed model's F-score declined from 88.15% in validation to 78.42% in testing, a considerably smaller decline than that of the other three models. These results demonstrate the potential of the proposed RL-based classifier in time-series monitoring applications as a highly generalizable model for unseen incoming data.
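The loop described in the abstract (one-class detectors produce the environment state, the agent predicts a binary label, and a reward function scores the prediction) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the detectors, the reward values, or the network, so everything below is an assumption made for illustration. In particular, a tabular one-step value update stands in for the deep RL agent, a mean-distance threshold stands in for the one-class anomaly detectors, the reward is assumed to be +1/-1, and the data is synthetic. All class and function names are hypothetical.

```python
import random


class MeanDistanceDetector:
    """Toy one-class detector (assumption): a sample is an inlier when it
    lies within `k` standard deviations of the mean of the fitted class."""

    def __init__(self, k=2.0):
        self.k = k
        self.mean = 0.0
        self.std = 1.0

    def fit(self, values):
        n = len(values)
        self.mean = sum(values) / n
        var = sum((v - self.mean) ** 2 for v in values) / n
        self.std = max(var ** 0.5, 1e-9)
        return self

    def is_inlier(self, value):
        return abs(value - self.mean) <= self.k * self.std


def make_state(detectors, value):
    """Environment state: one inlier (1) / outlier (0) flag per detector."""
    return tuple(int(d.is_inlier(value)) for d in detectors)


def reward(action, label):
    """Assumed reward: +1 for a correct binary prediction, -1 otherwise."""
    return 1.0 if action == label else -1.0


def greedy(values):
    """Pick the action with the larger value; ties go to action 0."""
    return 0 if values[0] >= values[1] else 1


def train_agent(stream, detectors, epsilon=0.1, alpha=0.5, seed=0):
    """Update a tabular value function one sample at a time (incrementally)."""
    rng = random.Random(seed)
    q = {}
    for value, label in stream:
        state = make_state(detectors, value)
        values = q.setdefault(state, [0.0, 0.0])
        # Epsilon-greedy exploration over the two actions (0 = OFF, 1 = ON).
        action = rng.randrange(2) if rng.random() < epsilon else greedy(values)
        # One-step update toward the received reward (no bootstrapping).
        values[action] += alpha * (reward(action, label) - values[action])
    return q


def predict(q, detectors, value):
    """Greedy prediction; unseen states default to action 0."""
    return greedy(q.get(make_state(detectors, value), [0.0, 0.0]))


def demo(seed=0):
    """Train and evaluate on synthetic 1-D data (not the PD dataset)."""
    rng = random.Random(seed)

    def sample(label):
        # Class 0 is centred at 0.0 and class 1 at 5.0, both with std 0.5.
        return rng.gauss(5.0 * label, 0.5), label

    train = [sample(i % 2) for i in range(400)]
    test = [sample(i % 2) for i in range(200)]
    # One detector per class, fitted only on that class's training samples.
    detectors = [
        MeanDistanceDetector().fit([v for v, y in train if y == c])
        for c in (0, 1)
    ]
    q = train_agent(train, detectors)
    correct = sum(predict(q, detectors, v) == y for v, y in test)
    return detectors, q, correct / len(test)


if __name__ == "__main__":
    _, _, accuracy = demo()
    print(f"toy test accuracy: {accuracy:.2f}")
```

Because the agent only ever sees the detector flags, its decisions depend on how the incoming data relates to each learned class rather than on raw signal values. That is one plausible reading of "environment states based on the dynamics of the incoming data"; the paper itself should be consulted for the actual state and reward definitions.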
