关于我们细粒度行为的海量数据的预测能力

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI:10.1145/2835776.2835846

F. Provost

{"title":"关于我们细粒度行为的海量数据的预测能力","authors":"F. Provost","doi":"10.1145/2835776.2835846","DOIUrl":null,"url":null,"abstract":"What really is it about \"big data\" that makes it different from traditional data? In this talk I illustrate one important aspect: massive ultra-fine-grained data on individuals' behaviors holds remarkable predictive power. I examine several applications to marketing-related tasks, showing how machine learning methods can extract the predictive power and how the value of the data \"asset\" seems different from the value of traditional data used for predictive modeling. I then dig deeper into explaining the predictions made from massive numbers of fine-grained behaviors by applying a counter-factual framework for explaining model behavior based on treating the individual behaviors as evidence that is combined by the model. This analysis shows that the fine-grained behavior data incorporate various sorts of information that we traditionally have sought to capture by other means. For example, for marketing modeling the behavior data effectively incorporate demographics, psychographics, category interest, and purchase intent. Finally, I discuss the flip side of the coin: the remarkable predictive power based on fine-grained information on individuals raises new privacy concerns. In particular, I discuss privacy concerns based on inferences drawn about us (in contrast to privacy concerns stemming from violations to data confidentiality). The evidence counterfactual approach used to explain the predictions also can be used to provide online consumers with transparency into the reasons why inferences are drawn about them. In addition, it offers the possibility to design novel solutions such as a privacy-friendly \"cloaking device\" to inhibit inferences from being drawn based on particular behaviors.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"17 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Predictive Power of Massive Data about our Fine-Grained Behavior\",\"authors\":\"F. Provost\",\"doi\":\"10.1145/2835776.2835846\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"What really is it about \\\"big data\\\" that makes it different from traditional data? In this talk I illustrate one important aspect: massive ultra-fine-grained data on individuals' behaviors holds remarkable predictive power. I examine several applications to marketing-related tasks, showing how machine learning methods can extract the predictive power and how the value of the data \\\"asset\\\" seems different from the value of traditional data used for predictive modeling. I then dig deeper into explaining the predictions made from massive numbers of fine-grained behaviors by applying a counter-factual framework for explaining model behavior based on treating the individual behaviors as evidence that is combined by the model. This analysis shows that the fine-grained behavior data incorporate various sorts of information that we traditionally have sought to capture by other means. For example, for marketing modeling the behavior data effectively incorporate demographics, psychographics, category interest, and purchase intent. Finally, I discuss the flip side of the coin: the remarkable predictive power based on fine-grained information on individuals raises new privacy concerns. In particular, I discuss privacy concerns based on inferences drawn about us (in contrast to privacy concerns stemming from violations to data confidentiality). The evidence counterfactual approach used to explain the predictions also can be used to provide online consumers with transparency into the reasons why inferences are drawn about them. In addition, it offers the possibility to design novel solutions such as a privacy-friendly \\\"cloaking device\\\" to inhibit inferences from being drawn based on particular behaviors.\",\"PeriodicalId\":20567,\"journal\":{\"name\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"volume\":\"17 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-02-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2835776.2835846\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2835776.2835846","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

“大数据”究竟与传统数据有何不同?在这次演讲中，我阐述了一个重要的方面:关于个人行为的大量超细粒度数据具有显著的预测能力。我研究了市场营销相关任务的几个应用，展示了机器学习方法如何提取预测能力，以及数据“资产”的价值与用于预测建模的传统数据的价值有何不同。然后，我通过应用一个反事实框架来解释模型行为，将个体行为视为模型结合的证据，从而更深入地解释从大量细粒度行为中做出的预测。该分析表明，细粒度的行为数据包含了我们传统上试图通过其他方式捕获的各种信息。例如，对于营销建模，行为数据有效地结合了人口统计、心理、类别兴趣和购买意图。最后，我讨论了硬币的另一面:基于个人细粒度信息的卓越预测能力引发了新的隐私问题。特别地，我根据对我们的推断来讨论隐私问题(与违反数据机密性所引起的隐私问题形成对比)。用来解释预测的证据反事实方法也可以用来为在线消费者提供透明度，让他们了解为什么会得出关于他们的推论。此外，它还提供了设计新颖解决方案的可能性，例如保护隐私的“隐形设备”，以阻止基于特定行为得出的推论。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

The Predictive Power of Massive Data about our Fine-Grained Behavior

What really is it about "big data" that makes it different from traditional data? In this talk I illustrate one important aspect: massive ultra-fine-grained data on individuals' behaviors holds remarkable predictive power. I examine several applications to marketing-related tasks, showing how machine learning methods can extract the predictive power and how the value of the data "asset" seems different from the value of traditional data used for predictive modeling. I then dig deeper into explaining the predictions made from massive numbers of fine-grained behaviors by applying a counter-factual framework for explaining model behavior based on treating the individual behaviors as evidence that is combined by the model. This analysis shows that the fine-grained behavior data incorporate various sorts of information that we traditionally have sought to capture by other means. For example, for marketing modeling the behavior data effectively incorporate demographics, psychographics, category interest, and purchase intent. Finally, I discuss the flip side of the coin: the remarkable predictive power based on fine-grained information on individuals raises new privacy concerns. In particular, I discuss privacy concerns based on inferences drawn about us (in contrast to privacy concerns stemming from violations to data confidentiality). The evidence counterfactual approach used to explain the predictions also can be used to provide online consumers with transparency into the reasons why inferences are drawn about them. In addition, it offers the possibility to design novel solutions such as a privacy-friendly "cloaking device" to inhibit inferences from being drawn based on particular behaviors.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining

自引率

0.00%

发文量