A Visualization Method for Training Data Comparison

2021 25th International Conference Information Visualisation (IV); Pub Date: 2021-07-01; DOI: 10.1109/IV53921.2021.00040

Karen Kosaka, T. Itoh

Abstract

With the diversification of machine learning applications, verifying and comparing the quality of training data has become an important process. For example, in transfer learning, verifying the difference in quality between the source and target data can help prevent the model's accuracy from deteriorating. However, training datasets for deep learning keep growing larger, and analyzing them is not always easy. To address this problem, we are working on visualization for training data validation. In this study, we apply dimensionality reduction to the training datasets and display them as scatterplots, enabling a visual analysis in which differences in quality are easy to detect. Our current implementation draws the regions where points are concentrated as semitransparent polygons for each label in the scatterplot. The implementation also provides a slider that sets a threshold for polygon generation, so the polygons can be adjusted interactively. This allows us to observe differences in the distribution of labels among the training data.
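
The abstract does not say how the polygons are generated, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes the points have already been projected to 2D by some dimensionality reduction method, treats a point as "concentrated" when it has at least `threshold` same-label neighbours within `radius`, and wraps the surviving points of each label in a convex hull. The names `dense_region_polygons`, `convex_hull`, `radius`, and `threshold` are hypothetical; `threshold` plays the role of the slider value.

```python
from collections import defaultdict
from math import dist


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


def dense_region_polygons(points, labels, radius, threshold):
    """Assumed density rule: keep points with >= threshold same-label
    neighbours within radius, then hull them per label."""
    by_label = defaultdict(list)
    for p, lab in zip(points, labels):
        by_label[lab].append(tuple(p))

    polygons = {}
    for lab, pts in by_label.items():
        dense = [
            p
            for i, p in enumerate(pts)
            if sum(1 for j, q in enumerate(pts) if j != i and dist(p, q) <= radius)
            >= threshold
        ]
        polygons[lab] = convex_hull(dense)
    return polygons
```

Each returned vertex list could then be filled with a partially transparent colour over the scatterplot. Raising `threshold` keeps only the most crowded points and so shrinks or removes polygons, which is the kind of interactive adjustment the slider is described as providing. The neighbour count here is quadratic in the number of points per label, so a spatial index would be needed for large datasets.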
