Abdulrahman Alabduljabbar, Ahmed A. Abusnaina, Ülkü Meteriz-Yildiran, David A. Mohaisen
Title: Automated Privacy Policy Annotation with Information Highlighting Made Practical Using Deep Representations
Published in: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
Publication date: 2021-11-12
DOI: 10.1145/3460120.3485335 (https://doi.org/10.1145/3460120.3485335)
Citation count: 5
Abstract
Privacy policy statements are the primary means by which service providers inform Internet users about their data collection and use practices, although they are often long and lack a specific structure. In this work, we introduce TLDR, a pipeline that employs various deep representation techniques for normalizing policies through learning and modeling, and an automated ensemble classifier for privacy policy classification. TLDR advances the state of the art by (i) categorizing policy contents into nine privacy policy categories with high accuracy, (ii) detecting missing information in privacy policies, and (iii) significantly reducing policy reading time and improving understandability for users.
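As a rough illustration of contributions (i) and (ii), the sketch below labels policy segments with an ensemble and then reports which categories never appear. It is not the authors' implementation: the abstract does not name the nine categories, the learned classifiers, or how the ensemble combines them. The placeholder labels `category_1` to `category_9`, the keyword-based stand-in classifiers, and the majority-vote rule are all assumptions made for this example.

```python
from collections import Counter

# Placeholder labels (assumption): the abstract says only that there are nine categories.
CATEGORIES = [f"category_{i}" for i in range(1, 10)]


def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def annotate(segments, classifiers):
    """Pair each policy segment with the ensemble's majority-vote label."""
    return [(seg, majority_vote([clf(seg) for clf in classifiers]))
            for seg in segments]


def missing_categories(annotations):
    """List the categories that no segment was assigned to."""
    found = {label for _, label in annotations}
    return [c for c in CATEGORIES if c not in found]


# Toy stand-ins for trained models (assumption): each maps a segment to a label.
def clf_a(seg):
    return "category_1" if "collect" in seg else "category_2"


def clf_b(seg):
    return "category_1" if "collect" in seg else "category_3"


def clf_c(seg):
    return "category_2"


if __name__ == "__main__":
    policy = ["we collect your email address",
              "we share data with partners"]
    labeled = annotate(policy, [clf_a, clf_b, clf_c])
    for segment, label in labeled:
        print(f"{label}: {segment}")
    print("missing:", missing_categories(labeled))
```

In a real pipeline the stand-in functions would be replaced by models trained on segment representations. Majority voting is used here only because it is the simplest way to combine several classifiers.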