A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations

Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini
{"title":"使用实例级解释的二进制分类器可视化诊断工作流","authors":"Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini","doi":"10.1109/VAST.2017.8585720","DOIUrl":null,"url":null,"abstract":"Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models for experts to understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics to see how data distributes across correct / incorrect decisions; one based on explanations to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, thus experts can generate useful hypotheses on how a model can be improved.","PeriodicalId":149607,"journal":{"name":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"87","resultStr":"{\"title\":\"A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations\",\"authors\":\"Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini\",\"doi\":\"10.1109/VAST.2017.8585720\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models for experts to understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics to see how data distributes across correct / incorrect decisions; one based on explanations to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. 
The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, thus experts can generate useful hypotheses on how a model can be improved.\",\"PeriodicalId\":149607,\"journal\":{\"name\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"87\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VAST.2017.8585720\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VAST.2017.8585720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 87

Abstract

Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models so that experts can understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics, to see how data distributes across correct/incorrect decisions; one based on explanations, to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, enabling them to generate useful hypotheses on how the model can be improved.
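
To make the notion of “instance-level explanations” concrete, the sketch below shows one common way to compute local feature relevance for a binary classifier: measure how the predicted positive-class probability changes when a single feature is neutralized, then aggregate those per-instance explanations across correct and incorrect decisions (mirroring the first two steps of the workflow). This is an illustrative approximation under assumed names and data, not the paper's actual explanation algorithm.

```python
# Minimal, illustrative sketch of "instance-level explanations" as local
# feature relevance: the change in predicted positive-class probability when
# one feature is neutralized (replaced by its training mean). This is an
# assumption-laden stand-in, NOT the method implemented in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, flip_y=0.05,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

def instance_explanation(x):
    """Signed relevance of each feature for a single instance x."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    relevance = np.zeros(len(x))
    for j in range(len(x)):
        x_masked = x.copy()
        x_masked[j] = X_tr[:, j].mean()  # crude proxy for "removing" feature j
        relevance[j] = base - model.predict_proba(x_masked.reshape(1, -1))[0, 1]
    return relevance

# Step 1 of the workflow: partition instances into correct/incorrect decisions.
pred = model.predict(X_te)
correct = pred == y_te

# Step 2: aggregate the per-instance explanations within each group to see
# which features dominate the classifier's correct vs. incorrect decisions.
expl = np.array([instance_explanation(x) for x in X_te])
print("mean |relevance| (correct):  ", np.abs(expl[correct]).mean(axis=0).round(3))
print("mean |relevance| (incorrect):", np.abs(expl[~correct]).mean(axis=0).round(3))
```

In the workflow the abstract describes, such aggregated explanation vectors would feed the explanation-level view, and a feature that stands out among misclassified instances would then be investigated in the raw-data view to hypothesize a root cause.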