PDPilot: Exploring Partial Dependence Plots Through Ranking, Filtering, and Clustering
Daniel Kerrigan, Brian Barr, Enrico Bertini
IEEE Transactions on Visualization and Computer Graphics, published 2025-02-25
DOI: 10.1109/TVCG.2025.3545025 (https://doi.org/10.1109/TVCG.2025.3545025)
Citation count: 0
Abstract
Partial dependence plots (PDPs) and individual conditional expectation (ICE) plots are visualizations used for explaining the behavior of machine learning (ML) models trained on tabular datasets. They show how the values of a feature or pair of features impact a model's predictions. However, in models with a large number of features, it is impractical for an ML practitioner to analyze all possible plots. To address this, we present new techniques for ranking and filtering PDP and ICE plots and build upon existing strategies for clustering the lines in ICE plots. Together, these techniques aim to help ML practitioners efficiently explore PDP and ICE plots and identify interesting model behavior. We integrate these techniques into PDPilot, a visual analytics tool that runs in Jupyter notebooks. We use PDPilot to study how 7 ML practitioners utilize the ranking, filtering, and clustering techniques to analyze an ML model.
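To make the two plot types concrete, the following is a minimal sketch of how ICE curves and a one-feature PDP are commonly computed. It is not PDPilot's API or implementation: `ice_curves`, `partial_dependence`, `toy_model`, and the three-row dataset are hypothetical names and values chosen for illustration. Each ICE curve varies one feature over a grid while holding an instance's other features fixed, and the PDP is the pointwise average of those curves.

```python
from statistics import mean


def ice_curves(predict, rows, feature, grid):
    """Return one ICE curve per row: the model's prediction as `feature`
    sweeps over `grid` while the row's other features stay fixed."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # overwrite only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """Return the PDP: the average of the ICE curves at each grid point."""
    return [mean(column) for column in zip(*curves)]


def toy_model(row):
    """Hypothetical model with an interaction between x1 and x2."""
    return 2.0 * row["x1"] + row["x1"] * row["x2"]


# Hypothetical three-instance tabular dataset.
rows = [
    {"x1": 0.0, "x2": -1.0},
    {"x1": 1.0, "x2": 0.0},
    {"x1": 2.0, "x2": 1.0},
]
grid = [0.0, 1.0, 2.0]

curves = ice_curves(toy_model, rows, "x1", grid)
pdp = partial_dependence(curves)

for row, curve in zip(rows, curves):
    print(f"x2={row['x2']:+.1f}  ICE: {curve}")
print(f"PDP: {pdp}")
```

In this toy case the three ICE lines have different slopes because of the `x1 * x2` interaction, while the PDP shows only their average. Differences of this kind between individual lines are what clustering the lines of an ICE plot is meant to surface.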