用于顺序决策的人在环系统的自适应隐私感知强化学习

Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation Pub Date : 2023-03-07 DOI:10.1145/3576842.3582325

Mojtaba Taherisadr, S. Stavroulakis, Salma Elmalaki

{"title":"用于顺序决策的人在环系统的自适应隐私感知强化学习","authors":"Mojtaba Taherisadr, S. Stavroulakis, Salma Elmalaki","doi":"10.1145/3576842.3582325","DOIUrl":null,"url":null,"abstract":"Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications. Privacy concerns have grown with the widespread use of RL trained with privacy-sensitive data in IoT devices, especially for human-in-the-loop systems. On the one hand, RL methods enhance the user experience by trying to adapt to the highly dynamic nature of humans. On the other hand, trained policies can leak the user’s private information. Recent attention has been drawn to designing privacy-aware RL algorithms while maintaining an acceptable system utility. A central challenge in designing privacy-aware RL, especially for human-in-the-loop systems, is that humans have intrinsic variability, and their preferences and behavior evolve. The effect of one privacy leak mitigation can differ for the same human or across different humans over time. Hence, we can not design one fixed model for privacy-aware RL that fits all. To that end, we propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems. adaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference. We validate the proposed adaPARL on two IoT applications, namely (i) Human-in-the-Loop Smart Home and (ii) Human-in-the-Loop Virtual Reality (VR) Smart Classroom. Results obtained on these two applications validate the generality of adaPARL and its ability to provide a personalized privacy-utility trade-off. On average, adaPARL improves the utility by while reducing the privacy leak by on average.","PeriodicalId":266438,"journal":{"name":"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"adaPARL: Adaptive Privacy-Aware Reinforcement Learning for Sequential Decision Making Human-in-the-Loop Systems\",\"authors\":\"Mojtaba Taherisadr, S. Stavroulakis, Salma Elmalaki\",\"doi\":\"10.1145/3576842.3582325\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications. Privacy concerns have grown with the widespread use of RL trained with privacy-sensitive data in IoT devices, especially for human-in-the-loop systems. On the one hand, RL methods enhance the user experience by trying to adapt to the highly dynamic nature of humans. On the other hand, trained policies can leak the user’s private information. Recent attention has been drawn to designing privacy-aware RL algorithms while maintaining an acceptable system utility. A central challenge in designing privacy-aware RL, especially for human-in-the-loop systems, is that humans have intrinsic variability, and their preferences and behavior evolve. The effect of one privacy leak mitigation can differ for the same human or across different humans over time. Hence, we can not design one fixed model for privacy-aware RL that fits all. To that end, we propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems. adaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference. We validate the proposed adaPARL on two IoT applications, namely (i) Human-in-the-Loop Smart Home and (ii) Human-in-the-Loop Virtual Reality (VR) Smart Classroom. Results obtained on these two applications validate the generality of adaPARL and its ability to provide a personalized privacy-utility trade-off. On average, adaPARL improves the utility by while reducing the privacy leak by on average.\",\"PeriodicalId\":266438,\"journal\":{\"name\":\"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3576842.3582325\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3576842.3582325","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

摘要

与基于规则的方法相比，强化学习(RL)在各种应用中具有许多优点。随着在物联网设备中广泛使用经过隐私敏感数据训练的强化学习，特别是对于人在环系统，隐私问题日益严重。一方面，强化学习方法通过尝试适应人类的高度动态特性来增强用户体验。另一方面，经过训练的策略可能会泄露用户的私有信息。最近的注意力被吸引到设计隐私感知的强化学习算法，同时保持一个可接受的系统实用程序。设计隐私感知强化学习的一个核心挑战，特别是对于人在环系统，是人类具有内在的可变性，他们的偏好和行为是进化的。随着时间的推移，同一个人或不同的人的隐私泄漏缓解效果可能会有所不同。因此，我们不能为隐私感知强化学习设计一个适合所有人的固定模型。为此，我们提出了adaPARL，这是一种适用于隐私感知强化学习的自适应方法，特别是适用于人在环物联网系统。adaPARL根据人的行为和偏好提供了个性化的隐私-效用权衡。我们在两个物联网应用中验证了所提出的adaPARL，即(i)人在环智能家居和(ii)人在环虚拟现实(VR)智能教室。在这两个应用程序上获得的结果验证了adaPARL的通用性及其提供个性化隐私与实用程序之间权衡的能力。平均而言，adaPARL提高了实用程序，同时平均减少了隐私泄漏。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

adaPARL: Adaptive Privacy-Aware Reinforcement Learning for Sequential Decision Making Human-in-the-Loop Systems

Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications. Privacy concerns have grown with the widespread use of RL trained with privacy-sensitive data in IoT devices, especially for human-in-the-loop systems. On the one hand, RL methods enhance the user experience by trying to adapt to the highly dynamic nature of humans. On the other hand, trained policies can leak the user’s private information. Recent attention has been drawn to designing privacy-aware RL algorithms while maintaining an acceptable system utility. A central challenge in designing privacy-aware RL, especially for human-in-the-loop systems, is that humans have intrinsic variability, and their preferences and behavior evolve. The effect of one privacy leak mitigation can differ for the same human or across different humans over time. Hence, we can not design one fixed model for privacy-aware RL that fits all. To that end, we propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems. adaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference. We validate the proposed adaPARL on two IoT applications, namely (i) Human-in-the-Loop Smart Home and (ii) Human-in-the-Loop Virtual Reality (VR) Smart Classroom. Results obtained on these two applications validate the generality of adaPARL and its ability to provide a personalized privacy-utility trade-off. On average, adaPARL improves the utility by while reducing the privacy leak by on average.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation

自引率

0.00%

发文量