Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation

Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich

arXiv:2409.07510 · arXiv - CS - Computers and Society · 2024-09-11
{"title":"更多的 \"空\":负责任的缺失值估算基准","authors":"Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich","doi":"arxiv-2409.07510","DOIUrl":null,"url":null,"abstract":"We present Shades-of-NULL, a benchmark for responsible missing value\nimputation. Our benchmark includes state-of-the-art imputation techniques, and\nembeds them into the machine learning development lifecycle. We model realistic\nmissingness scenarios that go beyond Rubin's classic Missing Completely at\nRandom (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR), to\ninclude multi-mechanism missingness (when different missingness patterns\nco-exist in the data) and missingness shift (when the missingness mechanism\nchanges between training and test). Another key novelty of our work is that we\nevaluate imputers holistically, based on the predictive performance, fairness\nand stability of the models that are trained and tested on the data they\nproduce. We use Shades-of-NULL to conduct a large-scale empirical study involving\n20,952 experimental pipelines, and find that, while there is no single\nbest-performing imputation approach for all missingness types, interesting\nperformance patterns do emerge when comparing imputer performance in simpler\nvs. more complex missingness scenarios. Further, while predictive performance,\nfairness and stability can be seen as orthogonal, we identify trade-offs among\nthem that arise due to the combination of missingness scenario, the choice of\nan imputer, and the architecture of the model trained on the data\npost-imputation. We make Shades-of-NULL publicly available, and hope to enable\nresearchers to comprehensively and rigorously evaluate new missing value\nimputation methods on a wide range of evaluation metrics, in plausible and\nsocially meaningful missingness scenarios.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"26 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation\",\"authors\":\"Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich\",\"doi\":\"arxiv-2409.07510\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present Shades-of-NULL, a benchmark for responsible missing value\\nimputation. Our benchmark includes state-of-the-art imputation techniques, and\\nembeds them into the machine learning development lifecycle. We model realistic\\nmissingness scenarios that go beyond Rubin's classic Missing Completely at\\nRandom (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR), to\\ninclude multi-mechanism missingness (when different missingness patterns\\nco-exist in the data) and missingness shift (when the missingness mechanism\\nchanges between training and test). Another key novelty of our work is that we\\nevaluate imputers holistically, based on the predictive performance, fairness\\nand stability of the models that are trained and tested on the data they\\nproduce. We use Shades-of-NULL to conduct a large-scale empirical study involving\\n20,952 experimental pipelines, and find that, while there is no single\\nbest-performing imputation approach for all missingness types, interesting\\nperformance patterns do emerge when comparing imputer performance in simpler\\nvs. more complex missingness scenarios. 
Further, while predictive performance,\\nfairness and stability can be seen as orthogonal, we identify trade-offs among\\nthem that arise due to the combination of missingness scenario, the choice of\\nan imputer, and the architecture of the model trained on the data\\npost-imputation. We make Shades-of-NULL publicly available, and hope to enable\\nresearchers to comprehensively and rigorously evaluate new missing value\\nimputation methods on a wide range of evaluation metrics, in plausible and\\nsocially meaningful missingness scenarios.\",\"PeriodicalId\":501112,\"journal\":{\"name\":\"arXiv - CS - Computers and Society\",\"volume\":\"26 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Computers and Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07510\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computers and Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07510","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation
We present Shades-of-NULL, a benchmark for responsible missing value imputation. Our benchmark includes state-of-the-art imputation techniques and embeds them into the machine learning development lifecycle. We model realistic missingness scenarios that go beyond Rubin's classic Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR) settings to include multi-mechanism missingness (when different missingness patterns co-exist in the data) and missingness shift (when the missingness mechanism changes between training and test).
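To make these scenarios concrete, below is a minimal sketch, assuming a small pandas DataFrame with illustrative `age` and `income` columns, of how MCAR, MAR, MNAR, multi-mechanism missingness, and missingness shift can be simulated. Nothing here is taken from the benchmark's released code; all names, thresholds, and probabilities are assumptions.

```python
# A minimal sketch (not from the Shades-of-NULL codebase) of simulating the
# three classic mechanisms, multi-mechanism missingness, and missingness shift.
# Column names, probabilities, and thresholds are illustrative assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def mcar_mask(df, p=0.2):
    # MCAR: rows are masked with a fixed probability, independent of the data.
    return rng.random(len(df)) < p

def mar_mask(df, cond_col, p=0.4):
    # MAR: masking depends only on an *observed* column (cond_col).
    return (df[cond_col] > df[cond_col].median()).to_numpy() & (rng.random(len(df)) < p)

def mnar_mask(df, col, p=0.4):
    # MNAR: masking depends on the (soon-to-be-unobserved) value of col itself.
    return (df[col] > df[col].median()).to_numpy() & (rng.random(len(df)) < p)

df = pd.DataFrame({"age": rng.normal(40, 12, 1000),
                   "income": rng.lognormal(10, 0.5, 1000)})

# Multi-mechanism missingness: different mechanisms co-exist in one dataset.
train = df.copy()
train.loc[mcar_mask(train), "age"] = np.nan
train.loc[mnar_mask(train, "income"), "income"] = np.nan

# Missingness shift: at test time the same column goes missing under a
# different mechanism (MAR via "age") than it did at training time (MNAR).
test = df.copy()
test.loc[mar_mask(test, cond_col="age"), "income"] = np.nan
```

Under missingness shift, an imputer fit on the training distribution of missing values faces a different pattern at test time, which is exactly the stress the benchmark aims to exercise.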
Another key novelty of our work is that we evaluate imputers holistically, based on the predictive performance, fairness, and stability of the models that are trained and tested on the data they produce. We use Shades-of-NULL to conduct a large-scale empirical study involving 20,952 experimental pipelines, and find that, while there is no single best-performing imputation approach for all missingness types, interesting performance patterns do emerge when comparing imputer performance in simpler vs. more complex missingness scenarios.
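For concreteness, the following is one plausible instantiation of such a holistic evaluation: a minimal sketch, assuming numpy arrays and scikit-learn-style imputers and classifiers, that measures fairness as an accuracy gap between two demographic groups and stability as prediction disagreement across bootstrap retrainings. The benchmark's actual metric definitions may differ from these choices.

```python
# One plausible instantiation of holistic imputer evaluation (illustrative,
# not the benchmark's exact metric definitions). Inputs are numpy arrays;
# `imputer` and `model` follow the scikit-learn fit/transform/predict API.
import numpy as np
from sklearn.base import clone
from sklearn.metrics import accuracy_score
from sklearn.utils import resample

def evaluate_imputer(imputer, model, X_tr, y_tr, X_te, y_te, group_te, n_boot=20):
    all_preds, accs, gaps = [], [], []
    g0, g1 = group_te == 0, group_te == 1
    for b in range(n_boot):
        # Stability is probed by retraining on bootstrap resamples of the train set.
        Xb, yb = resample(X_tr, y_tr, random_state=b)
        imp = clone(imputer).fit(Xb)              # fit the imputer on training data only
        m = clone(model).fit(imp.transform(Xb), yb)
        preds = m.predict(imp.transform(X_te))
        all_preds.append(preds)
        accs.append(accuracy_score(y_te, preds))
        # Fairness here = accuracy gap between two demographic groups
        # (one common choice among many group-fairness metrics).
        gaps.append(abs(accuracy_score(y_te[g0], preds[g0]) -
                        accuracy_score(y_te[g1], preds[g1])))
    return {
        "accuracy": float(np.mean(accs)),
        "fairness_gap": float(np.mean(gaps)),
        # Instability: mean per-example spread of predictions across retrained models.
        "instability": float(np.array(all_preds).std(axis=0).mean()),
    }
```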
Further, while predictive performance, fairness, and stability can be seen as orthogonal, we identify trade-offs among them that arise from the combination of the missingness scenario, the choice of imputer, and the architecture of the model trained on the data post-imputation. We make Shades-of-NULL publicly available, and hope to enable researchers to comprehensively and rigorously evaluate new missing value imputation methods on a wide range of evaluation metrics, in plausible and socially meaningful missingness scenarios.
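Finally, to illustrate how such trade-offs can be surfaced in practice, here is a small synthetic grid over scenario × imputer × model that reuses the `evaluate_imputer` sketch above. The `apply_scenario` helper, the data, and the tiny grid are illustrative stand-ins for the benchmark's much larger set of 20,952 pipelines.

```python
# Illustrative grid over scenario x imputer x model to show where trade-offs can
# surface; the real benchmark's grid is far larger and uses real datasets.
# `evaluate_imputer` is the sketch above; `apply_scenario` is a hypothetical helper.
from itertools import product
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 600
X = rng.normal(size=(n, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
group = (rng.random(n) < 0.3).astype(int)   # synthetic binary demographic attribute

def apply_scenario(X, kind):
    # Hypothetical helper: mask column 0 under one of the three mechanisms.
    X = X.copy()
    if kind == "mcar":
        X[rng.random(len(X)) < 0.2, 0] = np.nan                    # independent of data
    elif kind == "mar":
        X[(X[:, 1] > 0) & (rng.random(len(X)) < 0.4), 0] = np.nan  # depends on observed col
    else:  # "mnar"
        X[(X[:, 0] > 0) & (rng.random(len(X)) < 0.4), 0] = np.nan  # depends on masked value
    return X

imputers = {"median": SimpleImputer(strategy="median"), "knn": KNNImputer()}
models = {"logreg": LogisticRegression(),
          "forest": RandomForestClassifier(n_estimators=50)}
tr, te = slice(0, 400), slice(400, n)

for kind, (i_name, imp), (m_name, mdl) in product(["mcar", "mar", "mnar"],
                                                  imputers.items(), models.items()):
    Xm = apply_scenario(X, kind)
    report = evaluate_imputer(imp, mdl, Xm[tr], y[tr], Xm[te], y[te], group[te])
    print(kind, i_name, m_name, report)
```

Even on a toy grid like this, one can observe the paper's qualitative point: the ranking of imputers by accuracy need not match their ranking by fairness gap or instability, and the rankings can change with the missingness mechanism and model family.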