Shao Yang, Yuehan Wang, Y. Yao, Haoyu Wang, Yanfang Ye, Xusheng Xiao
DescribeCtx: Context-Aware Description Synthesis for Sensitive Behaviors in Mobile Apps
2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), published 2022-05-01
DOI: https://doi.org/10.1145/3510003.3510058
Citations: 4
Abstract
While mobile applications (i.e., apps) are becoming capable of handling various needs from users, their increasing access to sensitive data raises privacy concerns. To inform users of such sensitive behaviors, existing techniques propose to automatically identify explanatory sentences from app descriptions; however, many sensitive behaviors are not explained in the corresponding app descriptions. There also exist general techniques that translate code to sentences. However, these techniques lack the vocabulary to explain the uses of sensitive data and fail to consider the context (i.e., the app functionalities) of the sensitive behaviors. To address these limitations, we propose DescribeCtx, a context-aware description synthesis approach that trains a neural machine translation model using a large set of popular apps and generates app-specific descriptions for sensitive behaviors. Specifically, DescribeCtx encodes three heterogeneous sources as input: vocabularies provided by privacy policies, a behavior summary provided by the call graphs in code, and contextual information provided by GUI texts. Our evaluations on 1,262 Android apps show that, compared with existing baselines, DescribeCtx produces more accurate descriptions (24.96 in BLEU) and achieves higher user ratings with respect to the reference sentences manually identified in the app descriptions.
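The accuracy figure above is reported in BLEU, which scores a generated sentence by its n-gram overlap with a reference sentence. The abstract does not state which BLEU configuration was used (tokenization, smoothing, sentence-level or corpus-level), so the following is only a minimal sketch of a simplified, unsmoothed, single-reference BLEU on a 0-100 scale. The function names and the example sentences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, sentence-level BLEU on a 0-100 scale.

    Assumption: whitespace tokenization and no smoothing; real evaluations
    typically use a standard toolkit with its own tokenizer.
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: any empty precision zeroes the score
        log_precisions.append(math.log(matched / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical generated description vs. a hypothetical reference sentence.
generated = "uses your location to show nearby restaurants"
reference = "uses your location to show nearby restaurants and deals"
score = bleu(generated, reference)
```

Because this sketch applies no smoothing, any sentence with no matching 4-gram scores zero, which is one reason published results usually rely on corpus-level or smoothed variants.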