Fair Graph Auto-Encoder for Unbiased Graph Representations with Wasserstein Distance

Wei Fan, Kunpeng Liu, Rui Xie, Hao Liu, Hui Xiong, Yanjie Fu
DOI: 10.1109/ICDM51629.2021.00122
Published in: 2021 IEEE International Conference on Data Mining (ICDM), December 2021
Citations: 7

Abstract

Fairness is a critical concern when deploying machine learning models, because algorithms widely used in human society can easily discriminate. Researchers have studied disparity on tabular data extensively and have proposed many methods to mitigate bias. However, studies of unfairness on graphs are still at an early stage, even though graph data, which often represent connections among people in real-world applications, can easily give rise to fairness issues and therefore deserve great attention. Fair representation learning is one of the most effective approaches to mitigating bias: it aims to generate hidden representations of the input data while obfuscating sensitive information. In the graph setting, learning fair representations of a graph (also called fair graph embeddings) is an effective way to address graph unfairness. However, most existing work on fair graph embeddings studies fairness only at a coarse granularity (i.e., group fairness) and overlooks individual fairness. In this paper, we study fair graph representations at different levels; specifically, we consider both group fairness and individual fairness on graphs. To debias graph embeddings, we propose FairGAE, a fair graph auto-encoder model that derives unbiased graph embeddings from tailor-designed fair Graph Convolutional Network (GCN) layers. Then, to achieve multi-level fairness, we design a Wasserstein-distance-based regularizer that learns the optimal transport toward fairer embeddings. To address the efficiency concern, we further adopt the Sinkhorn divergence as an approximation of the Wasserstein cost. Finally, we apply the learned unbiased embeddings to the node classification task and conduct extensive experiments on two real-world graph datasets to demonstrate the improved performance of our approach.
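The FairGAE model described above builds on the standard graph auto-encoder architecture. The abstract does not specify the fair GCN layers' internals, so the following is only a minimal NumPy sketch of a *vanilla* (not fairness-aware) graph auto-encoder forward pass: two GCN layers encode node embeddings, and an inner-product decoder reconstructs edge probabilities. All weight shapes and names here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def normalized_adj(A):
    """Symmetric GCN normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_self = A + np.eye(A.shape[0])          # add self-loops
    d = A_self.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_self @ D_inv_sqrt

def gae_forward(A, X, W1, W2):
    """Graph auto-encoder forward pass (illustrative sketch).

    Encoder: two GCN layers producing node embeddings Z.
    Decoder: sigmoid inner product giving reconstructed edge probabilities.
    """
    A_norm = normalized_adj(A)
    H = np.maximum(A_norm @ X @ W1, 0.0)      # GCN layer 1 + ReLU
    Z = A_norm @ H @ W2                        # GCN layer 2 -> embeddings
    A_rec = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))   # edge probabilities in (0, 1)
    return Z, A_rec
```

In FairGAE, a fairness regularizer (the Wasserstein/Sinkhorn term) would be added to the reconstruction loss so that embeddings `Z` reveal less about sensitive attributes; that term is sketched separately below the abstract.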
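The Sinkhorn divergence mentioned in the abstract approximates the Wasserstein cost via entropy-regularized optimal transport, which replaces an expensive linear program with cheap matrix scalings. The sketch below shows the classic Sinkhorn-Knopp iteration for the regularized transport cost between two histograms; the function name, regularization value, and iteration count are illustrative choices, not values from the paper.

```python
import numpy as np

def sinkhorn_cost(a, b, M, reg=0.05, n_iters=200):
    """Entropy-regularized OT cost between histograms a and b.

    M is the pairwise cost matrix; smaller `reg` approaches the exact
    Wasserstein cost but needs more iterations to converge.
    """
    K = np.exp(-M / reg)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                 # match column marginals to b
        u = a / (K @ v)                   # match row marginals to a
    P = np.diag(u) @ K @ np.diag(v)       # approximate transport plan
    return float(np.sum(P * M))

# Identical point clouds on the line: transport cost is (near) zero.
x = np.array([0.0, 1.0])
M = (x[:, None] - x[None, :]) ** 2        # squared-distance cost
a = np.array([0.5, 0.5])
cost = sinkhorn_cost(a, a, M)             # ~0 for identical distributions
```

In a fairness regularizer of the kind the abstract describes, `a` and `b` would be distributions of embeddings from different demographic groups (or matched individuals), and minimizing this cost pushes the groups' representations together.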