Differentially Private Guarantees for Analytics and Machine Learning on Graphs: A Survey of Results
Tamara T. Mueller, Dmitrii Usynin, Johannes C. Paetzold, R. Braren, D. Rueckert, Georgios Kaissis
Journal of Privacy and Confidentiality, published 2024-02-11. DOI: https://doi.org/10.29012/jpc.820
Abstract: We study the applications of differential privacy (DP) to graph-structured data and discuss the formulations of DP applicable to the publication of graphs and their associated statistics, as well as to machine learning on graph-based data, including graph neural networks (GNNs). Interpreting DP guarantees on graph-structured data can be challenging, as individual data points are interconnected (often non-linearly or sparsely). This connectivity complicates the computation of individual privacy loss in differentially private learning. The problem is exacerbated by the absence of a single, well-established formulation of DP in graph settings. This issue extends to GNNs, rendering private machine learning on graph-structured data a challenging task. The lack of prior systematisation work motivated us to study graph-based learning from a privacy perspective. In this work, we systematise the different formulations of DP on graphs and discuss challenges and promising applications, including in the GNN domain. We compare and separate prior work into graph analytics tasks and graph learning tasks with GNNs. We conclude with a discussion of open questions and potential directions for further research in this area.
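One of the DP formulations the survey distinguishes is edge-level DP, in which neighbouring graphs differ in a single edge. A minimal illustrative sketch (not taken from the paper; the graph representation and function names are assumptions for illustration) is a Laplace-mechanism release of the edge count, whose sensitivity under edge-level adjacency is 1:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_edge_count(edges, epsilon: float) -> float:
    """Release the number of edges under edge-level DP.

    Neighbouring graphs differ in exactly one edge, so the global
    sensitivity of the edge count is 1 and Laplace(1/epsilon) noise
    yields an epsilon-DP release.
    """
    sensitivity = 1.0
    return len(edges) + laplace_noise(sensitivity / epsilon)
```

Under node-level DP, by contrast, neighbouring graphs differ in a node and all its incident edges, so the sensitivity of the same statistic grows with the maximum degree; this gap is one reason the survey stresses that the choice of adjacency notion matters.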
Beyond Legal Frameworks and Security Controls For Accessing Confidential Survey Data: Engaging Data Users in Data Protection
Amy Pienta, J. Jang, Margaret Levenstein
Journal of Privacy and Confidentiality, published 2023-12-06. DOI: https://doi.org/10.29012/jpc.845
Abstract: With a growing demand for data reuse and open data within the scientific ecosystem, protecting the confidentiality and privacy of survey data is increasingly important. It requires more than legal procedures and technological controls; it requires social and behavioral intervention. In this research note, we delineate the disclosure risks of various types of survey data (i.e., longitudinal data, social network data, sensitive information and biomarkers, and geographic data), the current motivations for data reuse, and the challenges to data protection. Despite rigorous efforts to protect data, threats to the confidentiality of microdata remain. Unintentional data breaches, protocol violations, and the misuse of data are observed even in well-established restricted data access systems, which indicates that these systems ultimately rely heavily on trust. Creating and maintaining that trust is critical to secure data access. We suggest four ways of building trust: user-centered design practices; promoting trust for protecting confidential data; general training in research ethics; and specific training in data security protocols, illustrated with the example of a new project, 'Researcher Passport', by the Inter-university Consortium for Political and Social Research. Continuous user-focused improvements in restricted data access systems are necessary so that we promote a culture of trust among the research and data user community, train users both in the general topic of responsible research and in the specific requirements of these systems, and offer systematic and holistic solutions.
Protecting Sensitive Data Early in the Research Data Lifecycle
Sebastian Karcher, Sefa Secen, Nicholas Weber
Journal of Privacy and Confidentiality, published 2023-12-06. DOI: https://doi.org/10.29012/jpc.846
Abstract: How do researchers in fieldwork-intensive disciplines protect sensitive data in the field, how do they assess their own practices, and how do they arrive at them? This article reports the results of a qualitative study based on 36 semi-structured interviews with qualitative and multi-method researchers in political science and humanitarian aid/migration studies. We find that researchers frequently feel ill-prepared to handle the management of sensitive data in the field and find that formal institutions provide little support. Instead, they use a patchwork of sources to devise strategies for protecting their informants and their data. We argue that this carries substantial risks for the security of the data as well as for their potential for later sharing and re-use. We conclude with some suggestions for effectively supporting data management in fieldwork-intensive research without unduly adding to the burden on researchers conducting it.
Restricted Data Management: The Current Practice and the Future
J. Jang, Amy Pienta, Margaret Levenstein, Joe Saul
Journal of Privacy and Confidentiality, published 2023-12-06. DOI: https://doi.org/10.29012/jpc.844
Abstract: Many organizations managing restricted data across the world have adopted the Five Safes framework (safe data, safe projects, safe people, safe settings, safe outputs) for their management of restricted and confidential data. While the Five Safes have been well integrated throughout the data life cycle, organizations observe several unintended challenges to making data FAIR (Findable, Accessible, Interoperable, Reusable). In this study, we review current practice in restricted data management and discuss challenges and future directions, focusing in particular on data use agreements, disclosure risk review, and training. In the future, organizations managing restricted data may need to proactively consider reducing inequalities in access to scientific development, preventing unethical use of data, and managing a wider variety of data types.
"I need a better description": An Investigation Into User Expectations For Differential Privacy
Rachel Cummings, Gabriel Kaptchuk, Elissa Redmiles
Journal of Privacy and Confidentiality, published 2023-08-31. DOI: https://doi.org/10.29012/jpc.813
Abstract: Despite the recent widespread deployment of differential privacy, relatively little is known about what users think of it. In this work, we explore users' privacy expectations related to differential privacy. Specifically, we investigate (1) whether users care about the protections afforded by differential privacy, and (2) whether they are therefore more willing to share their data with differentially private systems. Further, we attempt to understand (3) users' privacy expectations of the differentially private systems they may encounter in practice and (4) their willingness to share data in such systems. To answer these questions, we use a series of rigorously conducted surveys (n = 2424). We find that users care about the kinds of information leaks against which differential privacy protects and are more willing to share their private information when these leaks are less likely to occur. Additionally, we find that the ways in which differential privacy is described in the wild haphazardly set users' privacy expectations, which can be misleading depending on the deployment. We synthesize our results into a framework for understanding a user's willingness to share information with differentially private systems, which takes into account the interaction between the user's prior privacy concerns and how differential privacy is described.
Synthesizing Familial Linkages for Privacy in Microdata
Gary Benedetto, Evan Totty
Journal of Privacy and Confidentiality, published 2023-08-31. DOI: https://doi.org/10.29012/jpc.767
Abstract: As the Census Bureau strives to modernize its disclosure avoidance efforts in all of its outputs, synthetic data has become a successful way to give external researchers a chance to conduct a wide variety of analyses on microdata while still satisfying the legal objective of protecting the privacy of survey respondents. Some of the most useful variables for researchers are also some of the trickiest to model: relationships between records. These can be family relationships, household relationships, or employer-employee relationships, to name a few. This paper describes a method to match synthetic records together in a way that mimics the covariation between related records in the underlying, protected data.
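The matching idea can be illustrated with a generic greedy nearest-match sketch. This is not the paper's actual method; the function, the use of age as the matched covariate, and the fixed target gap are all assumptions made for illustration:

```python
def match_records(head_ages, spouse_ages, target_gap=2.0):
    """Greedily pair each household head with the remaining spouse
    record whose age difference is closest to a target gap estimated
    from the protected data, so that paired ages covary realistically
    in the released synthetic microdata."""
    remaining = list(spouse_ages)
    pairs = []
    for head in head_ages:
        # Pick the unmatched spouse whose age gap best fits the target.
        best = min(remaining, key=lambda s: abs((head - s) - target_gap))
        remaining.remove(best)
        pairs.append((head, best))
    return pairs
```

In practice one would match on a vector of covariates and draw the target gap from an estimated distribution rather than a constant, but the sketch shows the core idea: relationships are imposed after synthesis by pairing records so their joint statistics resemble those of the protected data.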
Privacy-Preserving Data Sharing for Genome-Wide Association Studies
Caroline Uhler, Aleksandra Slavković, Stephen E. Fienberg
Journal of Privacy and Confidentiality, 5(1):137-166, 2013. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4623434/pdf/