Brittany Johnson , Jesse Bartola , Rico Angell , Sam Witty , Stephen Giguere , Yuriy Brun
{"title":"Fairkit, fairkit, on the wall, who’s the fairest of them all? Supporting fairness-related decision-making","authors":"Brittany Johnson , Jesse Bartola , Rico Angell , Sam Witty , Stephen Giguere , Yuriy Brun","doi":"10.1016/j.ejdp.2023.100031","DOIUrl":null,"url":null,"abstract":"<div><p>Modern software relies heavily on data and machine learning, and affects decisions that shape our world. Unfortunately, recent studies have shown that because of biases in data, software systems frequently inject bias into their decisions, from producing more errors when transcribing women’s than men’s voices to overcharging people of color for financial loans. To address bias in software, data scientists and software engineers need tools that help them understand the trade-offs between model quality and fairness in their specific data domains. Toward that end, we present fairkit-learn, an interactive toolkit for helping engineers reason about and understand fairness. Fairkit-learn supports over 70 definition of fairness and works with state-of-the-art machine learning tools, using the same interfaces to ease adoption. It can evaluate thousands of models produced by multiple machine learning algorithms, hyperparameters, and data permutations, and compute and visualize a small Pareto-optimal set of models that describe the optimal trade-offs between fairness and quality. Engineers can then iterate, improving their models and evaluating them using fairkit-learn. We evaluate fairkit-learn via a user study with 54 students, showing that students using fairkit-learn produce models that provide a better balance between fairness and quality than students using scikit-learn and IBM AI Fairness 360 toolkits. With fairkit-learn, users can select models that are up to 67% more fair and 10% more accurate than the models they are likely to train with scikit-learn.</p></div>","PeriodicalId":44104,"journal":{"name":"EURO Journal on Decision Processes","volume":null,"pages":null},"PeriodicalIF":2.3000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"EURO Journal on Decision Processes","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2193943823000043","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MANAGEMENT","Score":null,"Total":0}
引用次数: 1
Abstract
Modern software relies heavily on data and machine learning, and affects decisions that shape our world. Unfortunately, recent studies have shown that because of biases in data, software systems frequently inject bias into their decisions, from producing more errors when transcribing women’s than men’s voices to overcharging people of color for financial loans. To address bias in software, data scientists and software engineers need tools that help them understand the trade-offs between model quality and fairness in their specific data domains. Toward that end, we present fairkit-learn, an interactive toolkit for helping engineers reason about and understand fairness. Fairkit-learn supports over 70 definition of fairness and works with state-of-the-art machine learning tools, using the same interfaces to ease adoption. It can evaluate thousands of models produced by multiple machine learning algorithms, hyperparameters, and data permutations, and compute and visualize a small Pareto-optimal set of models that describe the optimal trade-offs between fairness and quality. Engineers can then iterate, improving their models and evaluating them using fairkit-learn. We evaluate fairkit-learn via a user study with 54 students, showing that students using fairkit-learn produce models that provide a better balance between fairness and quality than students using scikit-learn and IBM AI Fairness 360 toolkits. With fairkit-learn, users can select models that are up to 67% more fair and 10% more accurate than the models they are likely to train with scikit-learn.