{"title":"How data-sharing nudges influence people's privacy preferences: A machine learning-based analysis","authors":"Yang Lu, Shujun Li, A. Freitas, A. Ioannou","doi":"10.4108/eai.21-12-2021.172440","DOIUrl":null,"url":null,"abstract":"INTRODUCTION: Many online services use data-sharing nudges to solicit personal data from their customers for personalized services. OBJECTIVES: This study aims to study people’s privacy preferences in sharing di ff erent types of personal data under di ff erent nudging conditions, how digital nudging can change their data sharing willingness, and if people’s data sharing preferences can be predicted using their responses to a questionnaire. METHODS: This paper reports a machine learning-based analysis on people’s privacy preference patterns under four di ff erent data-sharing nudging conditions (without nudging, monetary incentives, non-monetary incentives, and privacy assurance). The analysis is based on data collected from 685 UK residents who participated in a panel survey. Their self-reported willingness levels towards sharing 23 di ff erent types of personal data were analyzed by using both unsupervised (clustering) and supervised (classification) machine learning algorithms. RESULTS: The results led to a better understanding of people’s privacy preference patterns across di ff erent data-sharing nudging conditions, e.g., our participants’ preferences are distributed in a space of 48 possible profiles more sparsely than we expected, and the unexpected observation that all the three data-sharing nudging strategies led to an overall negative e ff ect: they led to a reduced level of self-reported willingness for more participants, comparing with the case of no nudging at all. Our experiments with supervised machine learning models also showed that people’s privacy (data-sharing) preference profiles can be automatically predicted with a good accuracy, even when a small questionnaire with just seven questions is used. CONCLUSION: Our work revealed a more complicated structure of people’s privacy preference profiles, which have some dependencies on the type of data nudging and the type of personal data shared. Such complicated privacy preference profiles can be e ff ectively analyzed using machine learning methods, including automatic prediction based on a small questionnaire. The negative results on the overall e ff ect of di ff erent data-sharing nudges imply that service providers should consider if and how to use such mechanisms to incentivise their consumers to share personal data. We believe that more consumer-centric and transparent methods and tools should be used to help improve trust between consumers and service providers.","PeriodicalId":335727,"journal":{"name":"EAI Endorsed Trans. Security Safety","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"EAI Endorsed Trans. Security Safety","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4108/eai.21-12-2021.172440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
INTRODUCTION: Many online services use data-sharing nudges to solicit personal data from their customers for personalized services. OBJECTIVES: This study aims to study people’s privacy preferences in sharing di ff erent types of personal data under di ff erent nudging conditions, how digital nudging can change their data sharing willingness, and if people’s data sharing preferences can be predicted using their responses to a questionnaire. METHODS: This paper reports a machine learning-based analysis on people’s privacy preference patterns under four di ff erent data-sharing nudging conditions (without nudging, monetary incentives, non-monetary incentives, and privacy assurance). The analysis is based on data collected from 685 UK residents who participated in a panel survey. Their self-reported willingness levels towards sharing 23 di ff erent types of personal data were analyzed by using both unsupervised (clustering) and supervised (classification) machine learning algorithms. RESULTS: The results led to a better understanding of people’s privacy preference patterns across di ff erent data-sharing nudging conditions, e.g., our participants’ preferences are distributed in a space of 48 possible profiles more sparsely than we expected, and the unexpected observation that all the three data-sharing nudging strategies led to an overall negative e ff ect: they led to a reduced level of self-reported willingness for more participants, comparing with the case of no nudging at all. Our experiments with supervised machine learning models also showed that people’s privacy (data-sharing) preference profiles can be automatically predicted with a good accuracy, even when a small questionnaire with just seven questions is used. CONCLUSION: Our work revealed a more complicated structure of people’s privacy preference profiles, which have some dependencies on the type of data nudging and the type of personal data shared. Such complicated privacy preference profiles can be e ff ectively analyzed using machine learning methods, including automatic prediction based on a small questionnaire. The negative results on the overall e ff ect of di ff erent data-sharing nudges imply that service providers should consider if and how to use such mechanisms to incentivise their consumers to share personal data. We believe that more consumer-centric and transparent methods and tools should be used to help improve trust between consumers and service providers.