eBoF: Interactive Temporal Correlation Analysis for Ensemble Data Based on Bag-of-Features

IF 7.5 · Region 3 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Zhifei Ding;Jiahao Han;Rongtao Qian;Liming Shen;Siru Chen;Lingxin Yu;Yu Zhu;Richen Liu
Journal: IEEE Transactions on Big Data, vol. 9, no. 6, pp. 1726-1737
Published: 2023-10-13
DOI: 10.1109/TBDATA.2023.3324482
URL: https://ieeexplore.ieee.org/document/10284987/
Citations: 0

Abstract

We propose eBoF, a novel time-varying ensemble data visualization approach based on the Bag-of-Features (BoF) model. In the eBoF model, we extract a simple and monotone interval from all target variables of ensemble scalar data as a local feature patch. Each local feature of a semantically simple single interval can be defined as a feature patch within the BoF model, with the duration of each interval (i.e., feature patch) serving as its frequency. Feature clusters in ensemble runs are then identified based on the similarity of temporal correlations. eBoF generates clusters along with their probability distributions across all feature patches while preserving the geo-spatial information, which is often lost in traditional topic modeling or clustering algorithms. The probability distribution across different clusters can help to generate reasonable clustering results, evaluated by domain knowledge. We conduct case studies and performance tests to evaluate the eBoF model and gather feedback from domain experts to further refine it. Evaluation results suggest the proposed eBoF can provide insightful and comprehensive evidence on ensemble simulation data analysis.
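The abstract describes splitting each time series into monotone intervals ("feature patches") and using each interval's duration as its frequency in a bag-of-features representation. A minimal sketch of that idea follows. It is not the authors' implementation: the function names, the "up"/"down"/"flat" labels, and the choice to key patches only by direction are illustrative assumptions, since the abstract does not specify how patches are encoded or compared.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical patch encoding: (direction label, duration in time steps).
Patch = Tuple[str, int]


def monotone_patches(series: List[float]) -> List[Patch]:
    """Split a time series into maximal monotone runs (assumed patch definition)."""
    patches: List[Patch] = []
    direction = None
    duration = 0
    for prev, curr in zip(series, series[1:]):
        step = "up" if curr > prev else "down" if curr < prev else "flat"
        if step == direction:
            duration += 1  # the current monotone run continues
        else:
            if direction is not None:
                patches.append((direction, duration))  # close the previous run
            direction, duration = step, 1
    if direction is not None:
        patches.append((direction, duration))  # close the final run
    return patches


def bag_of_features(patches: List[Patch]) -> Counter:
    """Accumulate patch durations per label, so duration plays the role of frequency."""
    bag: Counter = Counter()
    for direction, duration in patches:
        bag[direction] += duration
    return bag


# Made-up values standing in for one variable of one ensemble run.
run = [1.0, 2.0, 3.0, 2.5, 2.0, 2.0, 4.0]
patches = monotone_patches(run)
bag = bag_of_features(patches)
print(patches)
print(bag)
```

Under these assumptions, each ensemble run yields one duration-weighted histogram, and runs with similar histograms could then be grouped by any standard clustering or topic-modeling step. How eBoF actually forms clusters and their probability distributions is not stated in the abstract.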
Source journal
CiteScore: 11.80
Self-citation rate: 2.80%
Articles published: 114
About the journal: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.