Fair Decision Making via Automated Repair of Decision Trees

Jiang Zhang, Ivan Beschastnikh, Sergey Mechtaev, Abhik Roychoudhury
DOI: 10.1145/3524491.3527306
Published in: 2022 IEEE/ACM International Workshop on Equitable Data & Technology (FairWare), May 2022
Citations: 2

Abstract

Data-driven decision-making allows more resource allocation tasks to be done by programs. Unfortunately, real-life training datasets may capture human biases, and the learned models can be unfair. To resolve this, one could either train a new, fair model from scratch or repair an existing unfair model. The former approach can introduce unbounded semantic difference, and is hence unsuitable for social or legislative decisions. Meanwhile, the scalability of state-of-the-art model repair techniques is unsatisfactory. In this paper, we aim to automatically repair unfair decision models by converting any decision tree or random forest into a fair one with respect to a specific dataset and sensitive attributes. We built the FairRepair tool, inspired by automated program repair techniques for traditional programs. It uses a MaxSMT solver to decide which paths in the decision tree could be flipped or refined, with both fairness and semantic difference as hard constraints. Our approach is sound and complete, and the output repair always satisfies the desired fairness and semantic difference requirements. FairRepair is able to repair an unfair decision tree on the well-known COMPAS dataset [2] in 1 minute on average, achieving 90.3% fairness and only 2.3% semantic difference. We compared FairRepair with 4 state-of-the-art fairness learning algorithms [10, 13, 16, 18]. While achieving similar fairness by training new models, they incur 8.9% to 13.5% semantic difference. These results show that FairRepair is capable of repairing an unfair model while maintaining accuracy and incurring small semantic difference.

CCS CONCEPTS: • Computing methodologies → Philosophical/theoretical foundations of artificial intelligence; • Social and professional topics → Race and ethnicity.
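The repair formulation described in the abstract — flip the decisions of some tree paths so that a fairness constraint holds, while keeping the semantic difference (fraction of inputs whose outcome changes) bounded and flipping as few paths as possible — can be sketched on toy data. The paper uses a MaxSMT solver for this; the brute-force search below only mimics the MaxSMT objective (hard constraints plus a minimized flip count) for illustration. All path data, group names, and thresholds here are illustrative assumptions, not values from FairRepair.

```python
from itertools import combinations

# Hypothetical leaves/paths of a decision tree. For each path we record its
# current binary decision (1 = favorable) and how many training samples from
# each sensitive group reach it. Purely illustrative numbers.
paths = [
    {"label": 1, "group_a": 40, "group_b": 5},
    {"label": 0, "group_a": 10, "group_b": 30},
    {"label": 1, "group_a": 20, "group_b": 10},
    {"label": 0, "group_a": 30, "group_b": 55},
]

TOTAL = sum(p["group_a"] + p["group_b"] for p in paths)


def fairness_ratio(labels):
    """Demographic-parity ratio: the smaller group favorable-outcome rate
    divided by the larger one (1.0 = perfectly fair)."""
    a_total = sum(p["group_a"] for p in paths)
    b_total = sum(p["group_b"] for p in paths)
    a_fav = sum(p["group_a"] for p, l in zip(paths, labels) if l == 1)
    b_fav = sum(p["group_b"] for p, l in zip(paths, labels) if l == 1)
    ra, rb = a_fav / a_total, b_fav / b_total
    hi = max(ra, rb)
    return 1.0 if hi == 0 else min(ra, rb) / hi


def semantic_difference(labels):
    """Fraction of training samples whose outcome changes under the repair."""
    changed = sum(p["group_a"] + p["group_b"]
                  for p, l in zip(paths, labels) if l != p["label"])
    return changed / TOTAL


def repair(min_fairness=0.6, max_sem_diff=0.25):
    """Return new path labels satisfying both hard constraints while flipping
    the fewest paths (the MaxSMT-style objective), or None if infeasible."""
    original = [p["label"] for p in paths]
    for k in range(len(paths) + 1):            # try fewest flips first
        for flip_set in combinations(range(len(paths)), k):
            labels = [1 - l if i in flip_set else l
                      for i, l in enumerate(original)]
            if (fairness_ratio(labels) >= min_fairness
                    and semantic_difference(labels) <= max_sem_diff):
                return labels
    return None
```

On this toy tree the original labeling has a demographic-parity ratio of 0.25; `repair()` finds that flipping the second path alone meets both hard constraints. A real MaxSMT encoding (as in FairRepair) replaces this exhaustive loop with Boolean flip variables, the two hard constraints, and soft constraints that penalize each flip.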