基于一般线坐标的决策树可视化以支持可解释模型

2022 26th International Conference Information Visualisation (IV) Pub Date : 2022-05-09 DOI:10.1109/IV56949.2022.00065

Alex Worland, S. Wagle, B. Kovalerchuk

{"title":"基于一般线坐标的决策树可视化以支持可解释模型","authors":"Alex Worland, S. Wagle, B. Kovalerchuk","doi":"10.1109/IV56949.2022.00065","DOIUrl":null,"url":null,"abstract":"Visualization of Machine Learning (ML) models is an important part of the ML process to enhance the interpretability and prediction accuracy of the ML models. This paper proposes a new method SPC-DT to visualize the Decision Tree (DT) as interpretable models. These methods use a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized in a set of shifted pairs of 2-D Cartesian coordinates as a directed graph. The new method expands and complements the capabilities of existing methods, to visualize DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is to thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information is important for domain experts for evaluating and improving the DT models, including avoiding overgeneralization and overfitting of models, along with their performance. The benefits of the methods are demonstrated in the case studies, using three standard benchmarks.","PeriodicalId":153161,"journal":{"name":"2022 26th International Conference Information Visualisation (IV)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models\",\"authors\":\"Alex Worland, S. Wagle, B. Kovalerchuk\",\"doi\":\"10.1109/IV56949.2022.00065\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visualization of Machine Learning (ML) models is an important part of the ML process to enhance the interpretability and prediction accuracy of the ML models. This paper proposes a new method SPC-DT to visualize the Decision Tree (DT) as interpretable models. These methods use a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized in a set of shifted pairs of 2-D Cartesian coordinates as a directed graph. The new method expands and complements the capabilities of existing methods, to visualize DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is to thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information is important for domain experts for evaluating and improving the DT models, including avoiding overgeneralization and overfitting of models, along with their performance. The benefits of the methods are demonstrated in the case studies, using three standard benchmarks.\",\"PeriodicalId\":153161,\"journal\":{\"name\":\"2022 26th International Conference Information Visualisation (IV)\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 26th International Conference Information Visualisation (IV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IV56949.2022.00065\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 26th International Conference Information Visualisation (IV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IV56949.2022.00065","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

机器学习模型可视化是机器学习过程中提高模型可解释性和预测精度的重要组成部分。本文提出了一种将决策树可视化为可解释模型的新方法SPC-DT。这些方法使用一种称为移位配对坐标(SPC)的通用直线坐标。在SPC中，每个n-D点在一组位移的二维笛卡尔坐标对中被可视化为一个有向图。新方法扩展并补充了现有方法的功能，以可视化DT模型。它表明:(1)属性之间的关系，(2)相对于DT结构的个别情况，(3)DT中的数据流，(4)每次分割与DT节点中的阈值的紧密程度，以及(5)n-D空间中部分情况的密度。这些信息对于领域专家评估和改进DT模型非常重要，包括避免模型的过度泛化和过度拟合，以及它们的性能。使用三个标准基准，在案例研究中展示了这些方法的好处。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models

Visualization of Machine Learning (ML) models is an important part of the ML process to enhance the interpretability and prediction accuracy of the ML models. This paper proposes a new method SPC-DT to visualize the Decision Tree (DT) as interpretable models. These methods use a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized in a set of shifted pairs of 2-D Cartesian coordinates as a directed graph. The new method expands and complements the capabilities of existing methods, to visualize DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is to thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information is important for domain experts for evaluating and improving the DT models, including avoiding overgeneralization and overfitting of models, along with their performance. The benefits of the methods are demonstrated in the case studies, using three standard benchmarks.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 26th International Conference Information Visualisation (IV)

自引率

0.00%

发文量