DualRadviz: Preserving Context between Classification Evaluation and Data Exploration with RadViz

2016 5th Brazilian Conference on Intelligent Systems (BRACIS). Pub Date: 2016-10-01. DOI: 10.1109/BRACIS.2016.052

Igor Bueno Correa, A. Carvalho


Citations: 3

Abstract

With today's flood of data coming from many types of sources, Machine Learning is becoming increasingly important. However, Machine Learning alone is often not enough to make sense of all this data. This makes visualization a very useful tool for Machine Learning practitioners and data analysts alike. Interactive visualization techniques can help by giving insight into the meaning of the output of classification tasks. This also applies to the data itself, as visualization can make some of its characteristics clear. Several two-dimensional projection methods have been used to visualize data instances based on their attribute values. This visualization is more difficult when the instances have a large number of attributes. One visualization technique that can deal with high-dimensional data is Radial Coordinates Visualization (RadViz). RadViz can also be employed to visualize the performance of a probabilistic classifier, helping a user find problematic instances that might have been misclassified. In this study, a new approach to using RadViz is proposed and investigated. The proposed approach combines the two aforementioned uses of RadViz (attribute-based data exploration and result exploration based on the output of probabilistic classification). To this end, the approach provides an easy transition between the two types of visualization. This allows the context to be preserved, since the user can visually track the same data instance from one type of visualization to the other. To evaluate the proposed approach, a prototype named DualRadviz was implemented. In addition to RadViz, the prototype provides Parallel Coordinates visualization so that precise instance inspection can be performed, since, unlike RadViz, Parallel Coordinates does not suffer from ambiguity. To illustrate the usefulness of the proposed method, a case study is presented.
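The two uses of RadViz described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used RadViz formulation, in which each dimension gets an anchor on the unit circle and each instance is drawn at the weighted mean of those anchors. This is not the DualRadviz prototype's code, which the abstract does not describe; the function names and the toy data are hypothetical.

```python
import numpy as np

def radviz_anchors(n_dims):
    """Place one anchor per dimension, evenly spaced on the unit circle."""
    angles = 2.0 * np.pi * np.arange(n_dims) / n_dims
    return np.column_stack([np.cos(angles), np.sin(angles)])

def radviz_project(values):
    """Map rows of non-negative values to 2D as the weighted mean of the anchors."""
    values = np.asarray(values, dtype=float)
    anchors = radviz_anchors(values.shape[1])
    totals = values.sum(axis=1, keepdims=True)
    # Rows that sum to zero are placed at the centre to avoid division by zero.
    weights = np.divide(values, totals, out=np.zeros_like(values), where=totals > 0)
    return weights @ anchors

def min_max_scale(data):
    """Scale each attribute (column) to [0, 1] before projecting."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant columns map to 0
    return (data - lo) / span

# Hypothetical toy data: 3 instances, 4 attributes, 3-class probabilities.
attributes = np.array([[5.1, 3.5, 1.4, 0.2],
                       [6.2, 2.9, 4.3, 1.3],
                       [7.7, 3.0, 6.1, 2.3]])
probabilities = np.array([[0.90, 0.05, 0.05],
                          [0.20, 0.70, 0.10],
                          [0.10, 0.30, 0.60]])

# Use 1: attribute-based view (anchors are attributes).
attribute_view = radviz_project(min_max_scale(attributes))
# Use 2: classifier view (anchors are classes, weights are class probabilities).
classifier_view = radviz_project(probabilities)

# Ambiguity: two different instances land on the same point (the centre).
ambiguous = radviz_project([[1, 0, 1, 0], [0, 1, 0, 1]])
```

Because row `i` of `attribute_view` and row `i` of `classifier_view` refer to the same instance, the same point can be followed from one view to the other, which is the context preservation the abstract describes. The `ambiguous` case shows why a complementary view such as Parallel Coordinates is useful for precise inspection.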
