MarvelHideDroid: Reliable on-the-fly data anonymization based on Android virtualization

IF 4.0 · Zone 3: Computer Science · JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Francesco Pagano, Luca Verderame, Enrico Russo, Alessio Merlo
Computers & Electrical Engineering, vol. 121, Article 109882. Published 2024-11-27. DOI: 10.1016/j.compeleceng.2024.109882. URL: https://www.sciencedirect.com/science/article/pii/S0045790624008085
Citations: 0

Abstract

Modern mobile applications collect many user-generated events at run time through dedicated libraries called analytic libraries. These events give app developers useful information for improving the app, and they are equally important to analytic library providers (e.g., Google and Meta) for understanding users’ preferences. The user, however, is not involved in this process. Proposals to address the problem have come from both legal (e.g., the General Data Protection Regulation (GDPR)) and research perspectives. On the research side, several solutions for the Android ecosystem either limit the data gathered before the analytic libraries collect it or give the user control over the process. HideDroid was the first proposal to let the user define a different privacy level for each app installed on the device by leveraging k-anonymity and differential privacy techniques. VirtualHideDroid then extended HideDroid by applying the same approach to virtualized Android environments, in which one application (the plugin) runs within another application (the container). In this scenario, VirtualHideDroid runs as the container app and anonymizes user event data. However, according to standard threat models for virtualized Android environments, assuming that the container app is fully trusted is too optimistic in real deployments.
For this reason, this paper extends the original VirtualHideDroid work by assuming that the tool itself may be untrusted, i.e., controlled by an external attacker who has access to the container app and therefore full access to the user data. To solve this problem, we define a new approach, named MarvelHideDroid, which provides reliable anonymization of event data in the plugin app even when the container is malicious or compromised. Moreover, unlike VirtualHideDroid, MarvelHideDroid relies on a large language model (LLM) to automatically build the generalizations required by k-anonymity, which makes the anonymization strategy more robust to changes in the data structure of the events captured by the analytic libraries. We empirically demonstrate the viability and reliability of the proposal by testing an implementation of MarvelHideDroid on a set of real Android apps in a virtualized environment.
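To make the role of the generalizations concrete, the following is a minimal sketch of k-anonymity by generalization applied to analytics-style events. It is not the MarvelHideDroid implementation: the `HIERARCHY` table, the attribute names (`item`, `city`), and the functions `generalize`, `is_k_anonymous`, and `anonymize` are all hypothetical, and the hierarchy is hard-coded here where the abstract says an LLM would build it.

```python
from collections import Counter

# Hypothetical generalization hierarchy (an assumption for illustration).
# Each concrete value maps to a chain of progressively coarser values.
# The abstract states that an LLM builds such generalizations; here they
# are written by hand.
HIERARCHY = {
    "item": {
        "espresso": ["coffee", "beverage", "*"],
        "latte": ["coffee", "beverage", "*"],
        "croissant": ["pastry", "food", "*"],
        "muffin": ["pastry", "food", "*"],
    },
    "city": {
        "Genoa": ["Italy", "Europe", "*"],
        "Milan": ["Italy", "Europe", "*"],
        "Lyon": ["France", "Europe", "*"],
        "Paris": ["France", "Europe", "*"],
    },
}
MAX_LEVEL = 3  # length of every chain above


def generalize(event, level):
    """Replace each attribute value with its ancestor at the given level."""
    if level == 0:
        return dict(event)
    return {attr: HIERARCHY[attr][value][level - 1]
            for attr, value in event.items()}


def is_k_anonymous(events, k):
    """True if every distinct attribute combination occurs at least k times."""
    counts = Counter(tuple(sorted(e.items())) for e in events)
    return all(c >= k for c in counts.values())


def anonymize(events, k):
    """Raise the generalization level uniformly until k-anonymity holds."""
    for level in range(MAX_LEVEL + 1):
        candidate = [generalize(e, level) for e in events]
        if is_k_anonymous(candidate, k):
            return level, candidate
    raise ValueError("fewer than k events: k-anonymity cannot be reached")


EVENTS = [
    {"item": "espresso", "city": "Genoa"},
    {"item": "latte", "city": "Milan"},
    {"item": "espresso", "city": "Milan"},
    {"item": "croissant", "city": "Lyon"},
    {"item": "muffin", "city": "Paris"},
]

if __name__ == "__main__":
    level, anonymized = anonymize(EVENTS, k=2)
    print(level, anonymized)
```

The sketch generalizes every attribute by the same number of steps, which is the simplest possible policy. A hand-written table like `HIERARCHY` has no entry for a value it has not seen before, which is one way to read the abstract's motivation for generating the generalizations automatically.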
Source journal
Computers & Electrical Engineering (category: Engineering Technology, Engineering: Electrical & Electronic)
CiteScore: 9.20
Self-citation rate: 7.00%
Articles published: 661
Review time: 47 days
About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.