How Do Algorithmic Fairness Metrics Align with Human Judgement? A Mixed-Initiative System for Contextualized Fairness Assessment
Rareş Constantin, Moritz Dück, Anton Alexandrov, Patrik Matošević, Daphna Keidar, Mennatallah El-Assady
2022 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX), October 2022. DOI: 10.1109/TREX57753.2022.00005
Fairness evaluation presents a challenging problem in machine learning, and is usually restricted to the exploration of various metrics that attempt to quantify algorithmic fairness. However, due to cultural and perceptual biases, such metrics are often not powerful enough to accurately capture what people perceive as fair or unfair. To close the gap between human judgement and automated fairness evaluation, we develop a mixed-initiative system named FairAlign, where laypeople assess the fairness of different classification models by analyzing expressive and interactive visualizations of data. Using the aggregated qualitative feedback, data scientists and machine learning experts can examine the similarities and the differences between predefined fairness metrics and human judgement in a contextualized setting. To validate the utility of our system, we conducted a small study on a socially relevant classification task, where six people were asked to assess the fairness of multiple prediction models using the provided visualizations. The results show that our platform is able to give valuable guidance for model evaluation in case of otherwise contradicting and indecisive metrics for algorithmic fairness.
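The abstract does not specify which predefined fairness metrics FairAlign compares against human judgement, but a minimal sketch of two commonly used group-fairness metrics (demographic parity difference and equal opportunity difference, chosen here only for illustration) shows how such metrics are computed and how they can disagree about which model is fairer, the kind of contradictory situation the abstract describes. The function names and the toy data below are hypothetical, not taken from the paper.

```python
# Illustrative group-fairness metrics for a binary classifier with a
# binary protected attribute; not the paper's implementation.
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true-positive rates (recall) between the two groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return tpr(0) - tpr(1)

# Toy example: the two metrics disagree. Demographic parity sees no gap,
# while equal opportunity reports a disparity in true-positive rates.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_difference(y_pred, group))          # 0.0
print(equal_opportunity_difference(y_true, y_pred, group))   # ~ -0.33
```

When metrics conflict like this, a purely metric-driven evaluation gives no clear ranking of models, which is the gap the contextualized, human-in-the-loop assessment in FairAlign is meant to address.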