Contrastive Identification of Covariate Shift in Image Data

Matthew Lyle Olson, Thu Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, Minsuk Kahng
Published in: 2021 IEEE Visualization Conference (VIS)
Publication date: 2021-08-18
DOI: 10.1109/VIS49827.2021.9623289 (https://doi.org/10.1109/VIS49827.2021.9623289)
Citation count: 7

Abstract

Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated techniques can be used to detect the existence of covariate shift, our goal is to help human users characterize the extent of covariate shift in large image datasets with interfaces that seamlessly integrate information obtained from the detection algorithms. In this paper, we design and evaluate a new visual interface that facilitates the comparison of the local distributions of training and test data. We conduct a quantitative user study on multi-attribute facial data to compare two different learned low-dimensional latent representations (pretrained ImageNet CNN vs. density ratio) and two user analytic workflows (nearest-neighbor vs. cluster-to-cluster). Our results indicate that the latent representation of our density ratio model, combined with a nearest-neighbor comparison, is the most effective at helping humans identify covariate shift.
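
To make the idea of comparing local train and test distributions concrete, here is a minimal sketch in Python. It is not the paper's interface or its density ratio model. It assumes images have already been embedded into a low-dimensional latent space, and it uses a simple k-nearest-neighbor count as a stand-in for a local comparison. The function names, the toy 2-D latents, and the choice of k are all hypothetical.

```python
import math
import random


def knn_test_fraction(query, train, test, k=10):
    """Fraction of the k nearest pooled neighbors of `query` that come from `test`.

    Values near the global test share suggest no local shift; values near
    0 or 1 suggest a region covered by only one of the two datasets.
    """
    # Label each pooled point: 0 = training sample, 1 = test sample.
    pooled = [(math.dist(query, p), 0) for p in train]
    pooled += [(math.dist(query, p), 1) for p in test]
    pooled.sort(key=lambda pair: pair[0])
    nearest = pooled[:k]
    return sum(label for _, label in nearest) / k


def make_toy_latents(seed=0, n=200):
    """Hypothetical 2-D latents (assumption): one shared cluster, plus one
    cluster that appears only in training and one only in test."""
    rng = random.Random(seed)

    def blob(cx, cy, count):
        return [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)) for _ in range(count)]

    train = blob(0.0, 0.0, n) + blob(5.0, 0.0, n)
    test = blob(0.0, 0.0, n) + blob(0.0, 5.0, n)
    return train, test


train, test = make_toy_latents()
shared = knn_test_fraction((0.0, 0.0), train, test, k=20)      # region in both sets
train_only = knn_test_fraction((5.0, 0.0), train, test, k=20)  # region only in train
test_only = knn_test_fraction((0.0, 5.0), train, test, k=20)   # region only in test
print(shared, train_only, test_only)
```

In this toy setup, a query in the shared cluster sees a mix of training and test neighbors, while queries in the other two clusters see neighbors from only one dataset, which is the kind of localized shift the abstract describes. A learned density ratio would replace the raw neighbor count in a more faithful version.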