Towards Change Detection in Privacy Policies with Natural Language Processing
Andrick Adhikari, Rinku Dewri
2021 18th International Conference on Privacy, Security and Trust (PST), published 2021-12-13
DOI: https://doi.org/10.1109/PST52912.2021.9647767
Privacy policies notify users about the privacy practices of websites, mobile apps, and other products and services. However, users rarely read them and struggle to understand their contents. Because these documents are complicated, it becomes even harder to notice and understand changes of interest or concern when a policy is revised. With advances in machine learning and natural language processing, tools that can automatically annotate the sentences of a policy have been developed. These annotations can help a user identify and understand the relevant parts of a privacy policy. In this paper, we present our attempt to extend such annotations by also detecting the important changes that occurred across sentences. Using supervised machine learning models, word embeddings, similarity matching, and structural analysis of sentences, we present a process that takes two versions of a privacy policy as input, matches the sentences of one version to those of the other based on semantic similarity, and identifies relevant changes between each pair of matched sentences. We present the results and insights of applying our approach to 79 privacy policies manually downloaded from Facebook, WhatsApp, Twitter, Google, LinkedIn and Snapchat, spanning 1999 to 2020.
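The match-then-compare process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in bag-of-words cosine similarity for the word embeddings and supervised models the paper uses, and it relies on a word-level diff in place of structural analysis. The function names (`tokenize`, `cosine`, `match_sentences`, `word_changes`), the `0.5` threshold, and the sample sentences are all assumptions made for illustration.

```python
import difflib
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_sentences(old, new, threshold=0.5):
    """Pair each old sentence with its most similar new sentence.

    Returns a list of (old_sentence, new_sentence_or_None, score).
    The threshold is an illustrative assumption, not a value from the paper.
    """
    new_vecs = [Counter(tokenize(s)) for s in new]
    pairs = []
    for sentence in old:
        vec = Counter(tokenize(sentence))
        scores = [cosine(vec, nv) for nv in new_vecs]
        best = max(range(len(new)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            pairs.append((sentence, new[best], scores[best]))
        else:
            # No sufficiently similar sentence: treat as removed or rewritten.
            pairs.append((sentence, None, 0.0))
    return pairs


def word_changes(old_sentence, new_sentence):
    """List word-level (operation, old_words, new_words) edits between two sentences."""
    a, b = tokenize(old_sentence), tokenize(new_sentence)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical policy versions, invented for this example.
old_policy = [
    "We collect your email address.",
    "We do not share your data with advertisers.",
]
new_policy = [
    "We collect your email address and phone number.",
    "We may share your data with advertisers.",
]

for old_s, new_s, score in match_sentences(old_policy, new_policy):
    if new_s is not None:
        print(old_s, "->", new_s, word_changes(old_s, new_s))
```

In this toy input the second pair differs only in "do not" becoming "may", which is the kind of small but consequential edit a change detector should surface. A bag-of-words measure will miss paraphrases that share few words, which is one reason an embedding-based similarity, as the abstract describes, is the more robust choice.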