{"title":"FairMOE: counterfactually-fair mixture of experts with levels of interpretability","authors":"Joe Germino, Nuno Moniz, Nitesh V. Chawla","doi":"10.1007/s10994-024-06583-2","DOIUrl":null,"url":null,"abstract":"<p>With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully-interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently <i>fair</i> experts: <b>FairMOE</b>. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"29 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06583-2","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models’ predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully-interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
期刊介绍:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.