R. Hafen, Luke J. Gosink, J. Mcdermott, Karin D. Rodland, K. K. Dam, W. Cleveland
Title: Trelliscope: A system for detailed visualization in the deep analysis of large complex data
Venue: 2013 IEEE Symposium on Large-Scale Data Analysis and Visualization (LDAV)
Published: 2013-12-02
DOI: 10.1109/LDAV.2013.6675164
Citations: 22
Abstract
Trelliscope emanates from the Trellis Display framework for visualization and the Divide and Recombine (D&R) approach to analyzing large complex data. In Trellis, the data are broken up into subsets, a visualization method is applied to each subset, and the resulting display is an array of panels, one per subset. This is a powerful framework for visualizing data, both small and large. In D&R, the data are broken up into subsets, and any analytic method from statistics or machine learning is applied to each subset independently; the outputs are then recombined. This provides not only a powerful framework for analysis, but also feasible and practical computation on distributed computing facilities. It enables deep analysis of the data: study of both data summaries and the detailed data at their finest granularity, which is critical to a full understanding of the data. It also lets the analyst program in an interactive high-level language for data analysis such as R, so that the analyst can focus more on the data and less on code. In this paper we introduce Trelliscope, a system that scales Trellis to large complex data. It provides a way to create displays with a very large number of panels, and an interactive viewer that allows the analyst to sort, filter, and sample the panels in a meaningful way. We discuss the underlying principles, design, and scalable architecture of Trelliscope, and illustrate its use on three analysis projects in proteomics, high-intensity physics, and power systems engineering.
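The divide, apply, and recombine steps described in the abstract, together with sorting and filtering of per-subset results, can be illustrated with a minimal sketch. This is not Trelliscope's API or the authors' implementation: the function names (`divide`, `apply_per_subset`, `recombine`), the toy records, and the choice of a mean as both the analytic method and the recombination rule are all assumptions made for illustration, and the code runs in a single Python process rather than on distributed facilities.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data; the field names "group" and "value" are assumptions.
records = [
    {"group": "a", "value": 1.0},
    {"group": "a", "value": 3.0},
    {"group": "b", "value": 10.0},
    {"group": "b", "value": 20.0},
    {"group": "c", "value": 5.0},
]

def divide(rows, key):
    """Divide: break the data into subsets, one per distinct value of `key`."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[key]].append(row)
    return dict(subsets)

def apply_per_subset(subsets, method):
    """Apply an analytic method to each subset independently of the others."""
    return {name: method(rows) for name, rows in subsets.items()}

def recombine(outputs, combiner):
    """Recombine: merge the per-subset outputs into one result."""
    return combiner(list(outputs.values()))

def subset_mean(rows):
    """Illustrative analytic method: the mean of `value` within one subset."""
    return mean(row["value"] for row in rows)

subsets = divide(records, "group")
per_subset = apply_per_subset(subsets, subset_mean)

# Averaging the subset means is one simple recombination rule. It is not
# the same quantity as the mean over all records when subset sizes differ.
overall = recombine(per_subset, mean)

# Viewer-style navigation: order subsets by their summary value, then keep
# only those at or above a threshold (the threshold 5.0 is arbitrary).
ranked = sorted(per_subset, key=per_subset.get, reverse=True)
shown = [name for name in ranked if per_subset[name] >= 5.0]

print(per_subset)
print(ranked, shown)
```

In a Trellis-style display each subset would be drawn as a panel, and a per-subset summary like the one computed here is the kind of quantity a viewer could use to order or filter a very large set of panels.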