DescribeCtx: Context-Aware Description Synthesis for Sensitive Behaviors in Mobile Apps

2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). Pub Date: 2022-05-01. DOI: 10.1145/3510003.3510058

Shao Yang, Yuehan Wang, Y. Yao, Haoyu Wang, Yanfang Ye, Xusheng Xiao

Citations: 4

Abstract

While mobile applications (i.e., apps) are becoming capable of handling various user needs, their increasing access to sensitive data raises privacy concerns. To inform users of such sensitive behaviors, existing techniques propose to automatically identify explanatory sentences in app descriptions; however, many sensitive behaviors are not explained in the corresponding app descriptions. There also exist general techniques that translate code to sentences. However, these techniques lack the vocabulary to explain the uses of sensitive data and fail to consider the context (i.e., the app functionalities) of the sensitive behaviors. To address these limitations, we propose DescribeCtx, a context-aware description synthesis approach that trains a neural machine translation model using a large set of popular apps and generates app-specific descriptions for sensitive behaviors. Specifically, DescribeCtx encodes three heterogeneous sources as input: vocabularies provided by privacy policies, behavior summaries provided by the call graphs in code, and contextual information provided by GUI texts. Our evaluations on 1,262 Android apps show that, compared with existing baselines, DescribeCtx produces more accurate descriptions (24.96 in BLEU) and achieves higher user ratings with respect to the reference sentences manually identified in the app descriptions.
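
The abstract reports accuracy as a BLEU score measured against reference sentences. As a rough illustration of what such a score measures, the following is a minimal sketch of sentence-level BLEU in Python. It is not the authors' evaluation script: the function names, whitespace tokenization, add-one smoothing, and the example sentences are all assumptions made here for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU on a 0-100 scale against a single reference.

    Assumptions (not taken from the paper): lowercase whitespace
    tokenization and add-one smoothing of each n-gram precision.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        log_precisions.append(math.log((overlap + 1) / (total + 1)))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical reference and generated sentences.
    reference = "this app uses your location to find nearby restaurants"
    generated = "this app uses your location to show nearby stores"
    print(round(bleu(generated, reference), 2))
```

A generated sentence that shares more and longer word sequences with the reference scores higher, and an exact match scores 100 under this sketch. Published results are usually corpus-level scores from an established toolkit, so values from this sketch are not directly comparable to the 24.96 reported above.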
