{"title":"Prune bias from the root: Bias removal and fairness estimation by pruning sensitive attributes in pre-trained DNN models","authors":"Qiaolin Qin, Ettore Merlo","doi":"10.1016/j.infsof.2025.107906","DOIUrl":null,"url":null,"abstract":"<div><h3>Context:</h3><div>Deep learning models (DNNs) are widely used in high-stakes decision-making domains, but they often inherit and amplify biases present in training data, leading to unfair predictions. Given this context, fairness estimation metrics and bias removal methods are required to select and enhance fair models. However, we found that existing metrics lack robustness in estimating multi-attribute group fairness. Further, existing post-processing bias removal methods often focus on group fairness and fail to address individual fairness or optimize along multiple sensitive attributes.</div></div><div><h3>Objective:</h3><div>In this study, we explore the effectiveness of attribute pruning (i.e., zeroing out sensitive attribute weights in a pre-trained DNN’s input layer) in both bias removal and multi-attribute group fairness estimation.</div></div><div><h3>Methods:</h3><div>To study attribute pruning’s impact on bias removal, we conducted experiments on 32 models and 4 widely used datasets, and compared its effect in single-attribute group bias removal and accuracy preservation with 3 baseline post-processing methods. We then leveraged 3 datasets with multiple sensitive attributes to demonstrate how to use attribute pruning for multi-attribute group fairness estimation.</div></div><div><h3>Results:</h3><div>Single-attribute pruning can better preserve model accuracy than conventional post-processing methods in 23 out of 32 cases, and enforces individual fairness by design. However, since individual fairness and group fairness are fundamentally different objectives, attribute pruning’s effect on group fairness metrics is often inconsistent. We also extend our approach to a multi-attribute setting, demonstrating its potential for improving individual fairness jointly across sensitive attributes and for enabling multi-attribute fairness-aware model selection.</div></div><div><h3>Conclusion:</h3><div>Attribute pruning is a practical post-processing approach for enforcing individual fairness, with limited and data-dependent impact on group fairness. These limitations reflect the inherent trade-off between individual and group fairness objectives. In addition, attribute pruning provides a useful mechanism for bias estimation, particularly in multi-attribute contexts. We advocate for its adoption as a comparison baseline in fairness-aware AI development and encourage further exploration.</div></div>","PeriodicalId":54983,"journal":{"name":"Information and Software Technology","volume":"189 ","pages":"Article 107906"},"PeriodicalIF":4.3000,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information and Software Technology","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950584925002459","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
Context:
Deep neural network (DNN) models are widely used in high-stakes decision-making domains, but they often inherit and amplify biases present in training data, leading to unfair predictions. Given this context, fairness estimation metrics and bias removal methods are needed to select and enhance fair models. However, we found that existing metrics lack robustness in estimating multi-attribute group fairness. Further, existing post-processing bias removal methods often focus on group fairness and fail to address individual fairness or to optimize along multiple sensitive attributes.
Objective:
In this study, we explore the effectiveness of attribute pruning (i.e., zeroing out sensitive attribute weights in a pre-trained DNN’s input layer) in both bias removal and multi-attribute group fairness estimation.
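As a concrete illustration, the following is a minimal sketch of such a pruning operation, assuming a PyTorch model whose first nn.Linear layer consumes the raw feature vector. The architecture, helper name, and column indices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def prune_sensitive_attributes(model: nn.Module, sensitive_cols: list[int]) -> nn.Module:
    """Zero the input-layer weight columns of the given sensitive feature indices."""
    # nn.Linear stores weights with shape (out_features, in_features), so each
    # column of the weight matrix corresponds to one input feature; zeroing a
    # column removes that feature's direct influence on the network.
    first_linear = next(m for m in model.modules() if isinstance(m, nn.Linear))
    with torch.no_grad():
        first_linear.weight[:, sensitive_cols] = 0.0
    return model

# Hypothetical usage: prune feature columns 3 and 7 of a pre-trained classifier,
# then re-evaluate accuracy and fairness metrics on held-out data.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
pruned = prune_sensitive_attributes(model, sensitive_cols=[3, 7])
```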
Methods:
To study attribute pruning’s impact on bias removal, we conducted experiments on 32 models across 4 widely used datasets, comparing its single-attribute group bias removal and accuracy preservation against 3 baseline post-processing methods. We then leveraged 3 datasets with multiple sensitive attributes to demonstrate how attribute pruning can be used for multi-attribute group fairness estimation.
Results:
Single-attribute pruning can better preserve model accuracy than conventional post-processing methods in 23 out of 32 cases, and enforces individual fairness by design. However, since individual fairness and group fairness are fundamentally different objectives, attribute pruning’s effect on group fairness metrics is often inconsistent. We also extend our approach to a multi-attribute setting, demonstrating its potential for improving individual fairness jointly across sensitive attributes and for enabling multi-attribute fairness-aware model selection.
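To see why pruning enforces individual fairness by design while leaving group fairness unconstrained, consider a toy demonstration (ours, not the paper’s): once the sensitive-attribute weights are zeroed, two “twin” inputs that differ only in that attribute receive identical predictions, yet group-level outcome rates can still diverge through non-sensitive features correlated with the attribute.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
with torch.no_grad():
    model[0].weight[:, 0] = 0.0  # prune the sensitive feature at column 0

x_a = torch.tensor([[0.0, 1.2, -0.3, 0.7]])  # sensitive attribute = 0
x_b = torch.tensor([[1.0, 1.2, -0.3, 0.7]])  # sensitive attribute = 1, rest identical
assert torch.equal(model(x_a), model(x_b))   # identical treatment of "twins"
```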
Conclusion:
Attribute pruning is a practical post-processing approach for enforcing individual fairness, with limited and data-dependent impact on group fairness. These limitations reflect the inherent trade-off between individual and group fairness objectives. In addition, attribute pruning provides a useful mechanism for bias estimation, particularly in multi-attribute contexts. We advocate for its adoption as a comparison baseline in fairness-aware AI development and encourage further exploration.
About the Journal
Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "Negative" results and much more. Read the Guide for authors for more information.
The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.