A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations

Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini
{"title":"使用实例级解释的二进制分类器可视化诊断工作流","authors":"Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini","doi":"10.1109/VAST.2017.8585720","DOIUrl":null,"url":null,"abstract":"Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models for experts to understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics to see how data distributes across correct / incorrect decisions; one based on explanations to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, thus experts can generate useful hypotheses on how a model can be improved.","PeriodicalId":149607,"journal":{"name":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"87","resultStr":"{\"title\":\"A Workflow for Visual Diagnostics of Binary Classifiers using Instance-Level Explanations\",\"authors\":\"Josua Krause, Aritra Dasgupta, Jordan Swartz, Yindalon Aphinyanagphongs, E. Bertini\",\"doi\":\"10.1109/VAST.2017.8585720\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models for experts to understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics to see how data distributes across correct / incorrect decisions; one based on explanations to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. 
The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, thus experts can generate useful hypotheses on how a model can be improved.\",\"PeriodicalId\":149607,\"journal\":{\"name\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"87\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VAST.2017.8585720\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VAST.2017.8585720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 87

Abstract

Human-in-the-loop data analysis applications necessitate greater transparency in machine learning models so that experts can understand and trust their decisions. To this end, we propose a visual analytics workflow to help data scientists and domain experts explore, diagnose, and understand the decisions made by a binary classifier. The approach leverages “instance-level explanations”, measures of local feature relevance that explain single instances, and uses them to build a set of visual representations that guide the users in their investigation. The workflow is based on three main visual representations and steps: one based on aggregate statistics, to see how data distributes across correct/incorrect decisions; one based on explanations, to understand which features are used to make these decisions; and one based on raw data, to derive insights on potential root causes for the observed patterns. The workflow is derived from a long-term collaboration with a group of machine learning and healthcare professionals who used our method to make sense of machine learning models they developed. The case study from this collaboration demonstrates that the proposed workflow helps experts derive useful knowledge about the model and the phenomena it describes, enabling them to generate useful hypotheses on how the model can be improved.
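
To make the notion of “instance-level explanations” concrete, the sketch below shows one common way to compute local feature relevance for a binary classifier: measure how the predicted positive-class probability changes when a single feature is neutralized, then aggregate those per-instance explanations across correct and incorrect decisions (mirroring the first two steps of the workflow). This is an illustrative approximation under assumed names and data, not the paper's actual explanation algorithm.

```python
# Minimal, illustrative sketch of "instance-level explanations" as local
# feature relevance: the change in predicted positive-class probability when
# one feature is neutralized (replaced by its training mean). This is an
# assumption-laden stand-in, NOT the method implemented in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, flip_y=0.05,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

def instance_explanation(x):
    """Signed relevance of each feature for a single instance x."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    relevance = np.zeros(len(x))
    for j in range(len(x)):
        x_masked = x.copy()
        x_masked[j] = X_tr[:, j].mean()  # crude proxy for "removing" feature j
        relevance[j] = base - model.predict_proba(x_masked.reshape(1, -1))[0, 1]
    return relevance

# Step 1 of the workflow: partition instances into correct/incorrect decisions.
pred = model.predict(X_te)
correct = pred == y_te

# Step 2: aggregate the per-instance explanations within each group to see
# which features dominate the classifier's correct vs. incorrect decisions.
expl = np.array([instance_explanation(x) for x in X_te])
print("mean |relevance| (correct):  ", np.abs(expl[correct]).mean(axis=0).round(3))
print("mean |relevance| (incorrect):", np.abs(expl[~correct]).mean(axis=0).round(3))
```

In the workflow the abstract describes, such aggregated explanation vectors would feed the explanation-level view, and a feature that stands out among misclassified instances would then be investigated in the raw-data view to hypothesize a root cause.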