Multiscale Data Analysis Using Binning, Tensor Decompositions, and Backtracking

Dimitri Leggas, Thomas Henretty, J. Ezick, M. Baskaran, Brendan von Hofe, Grace H. Cimaszewski, Harper Langston, R. Lethin
{"title":"Multiscale Data Analysis Using Binning, Tensor Decompositions, and Backtracking","authors":"Dimitri Leggas, Thomas Henretty, J. Ezick, M. Baskaran, Brendan von Hofe, Grace H. Cimaszewski, Harper Langston, R. Lethin","doi":"10.1109/HPEC43674.2020.9286171","DOIUrl":null,"url":null,"abstract":"Large data sets can contain patterns at multiple scales (spatial, temporal, etc.). In practice, it is useful for data exploration techniques to detect patterns at each relevant scale. In this paper, we develop an approach to detect activities at multiple scales using tensor decomposition, an unsupervised high-dimensional data analysis technique that finds correlations between different features in the data. This method typically requires that the feature values are discretized during the construction of the tensor in a process called “binning.” We develop a method of constructing and decomposing tensors with different binning schemes of various features in order to uncover patterns across a set of user-defined scales. While binning is necessary to obtain interpretable results from tensor decompositions, it also decreases the specificity of the data. Thus, we develop backtracking methods that enable the recovery of original source data corresponding to patterns found in the decomposition. These technique are discussed in the context of spatiotemporal and network traffic data, and in particular on Automatic Identification System (AIS) data.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286171","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Large data sets can contain patterns at multiple scales (spatial, temporal, etc.). In practice, it is useful for data exploration techniques to detect patterns at each relevant scale. In this paper, we develop an approach to detect activities at multiple scales using tensor decomposition, an unsupervised high-dimensional data analysis technique that finds correlations between different features in the data. This method typically requires that the feature values are discretized during the construction of the tensor in a process called “binning.” We develop a method of constructing and decomposing tensors with different binning schemes of various features in order to uncover patterns across a set of user-defined scales. While binning is necessary to obtain interpretable results from tensor decompositions, it also decreases the specificity of the data. Thus, we develop backtracking methods that enable the recovery of original source data corresponding to patterns found in the decomposition. These technique are discussed in the context of spatiotemporal and network traffic data, and in particular on Automatic Identification System (AIS) data.
多尺度数据分析使用分箱,张量分解和回溯
大型数据集可以包含多个尺度(空间、时间等)的模式。在实践中,对于数据探索技术来说,在每个相关尺度上检测模式是很有用的。在本文中,我们开发了一种使用张量分解来检测多个尺度上的活动的方法,这是一种无监督的高维数据分析技术,可以发现数据中不同特征之间的相关性。这种方法通常要求在构造张量的过程中对特征值进行离散化,这一过程被称为“分割”。我们开发了一种构造和分解具有不同特征的不同分组方案的张量的方法,以便在一组用户定义的尺度上发现模式。虽然从张量分解中获得可解释的结果是必要的,但它也降低了数据的特异性。因此,我们开发了回溯方法,可以恢复与分解中发现的模式相对应的原始源数据。这些技术在时空和网络流量数据的背景下进行了讨论,特别是在自动识别系统(AIS)数据上。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信