Visual Informatics, Pub Date: 2024-12-01, DOI: 10.1016/j.visinf.2024.10.004
Jie Liu, Jie Li, Jielong Kuang
{"title":"Generative model-assisted sample selection for interest-driven progressive visual analytics","authors":"Jie Liu, Jie Li, Jielong Kuang","doi":"10.1016/j.visinf.2024.10.004","DOIUrl":"10.1016/j.visinf.2024.10.004","url":null,"abstract":"<div><div>We propose interest-driven progressive visual analytics. The core idea is to filter samples with features of interest to analysts from the given dataset for analysis. The approach relies on a generative model (GM) trained using the given dataset as the training set. The GM characteristics make it convenient to find ideal generated samples from its latent space. Then, we filter the original samples similar to the ideal generated ones to explore patterns. Our research involves two methods for achieving and applying the idea. First, we give a method to explore ideal samples from a GM’s latent space. Second, we integrate the method into a system to form an embedding-based analytical workflow. Patterns found on open datasets in case studies, results of quantitative experiments, and positive feedback from experts illustrate the general usability and effectiveness of the approach.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 97-108"},"PeriodicalIF":3.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143150181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-12-01, DOI: 10.1016/j.visinf.2024.09.006
Magnus Nylin, Jonas Lundberg, Magnus Bång, Kostiantyn Kucher
{"title":"Glyph design for communication initiation in real-time human-automation collaboration","authors":"Magnus Nylin , Jonas Lundberg , Magnus Bång , Kostiantyn Kucher","doi":"10.1016/j.visinf.2024.09.006","DOIUrl":"10.1016/j.visinf.2024.09.006","url":null,"abstract":"<div><div>Initiating communication and conveying critical information to the human operator is a key problem in human-automation collaboration. This problem is particularly pronounced in time-constrained safety critical domains such as in Air Traffic Management. A visual representation should aid operators understanding <em>why</em> the system initiates the communication, <em>when</em> the operator must act, and the <em>consequences of not responding</em> to the cue. Data <em>glyphs</em> can be used to present multidimensional data, including temporal data in a compact format to facilitate this type of communication. In this paper, we propose a glyph design for communication initialization for highly automated systems in Air Traffic Management, Vessel Traffic Service, and Train Traffic Management. The design was assessed by experts in these domains in three workshop sessions. The results showed that the number of glyphs to be presented simultaneously and the type of situation were domain-specific glyph design aspects that needed to be adjusted for each work domain. The results also showed that the core of the glyph design could be reused between domains, and that the operators could successfully interpret the temporal data representations. We discuss similarities and differences in the applicability of the glyph design between the different domains, and finally, we provide some suggestions for future work based on the results from this study.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 23-35"},"PeriodicalIF":3.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-12-01, DOI: 10.1016/j.visinf.2024.10.003
Fang Zhu, Xufei Zhu, Xumeng Wang, Yuxin Ma, Jieqiong Zhao
{"title":"ATVis: Understanding and diagnosing adversarial training processes through visual analytics","authors":"Fang Zhu , Xufei Zhu , Xumeng Wang , Yuxin Ma , Jieqiong Zhao","doi":"10.1016/j.visinf.2024.10.003","DOIUrl":"10.1016/j.visinf.2024.10.003","url":null,"abstract":"<div><div>Adversarial training has emerged as a major strategy against adversarial perturbations in deep neural networks, which mitigates the issue of exploiting model vulnerabilities to generate incorrect predictions. Despite enhancing robustness, adversarial training often results in a trade-off with standard accuracy on normal data, a phenomenon that remains a contentious issue. In addition, the opaque nature of deep neural network models renders it more difficult to inspect and diagnose how adversarial training processes evolve. This paper introduces ATVis, a visual analytics framework for examining and diagnosing adversarial training processes. Through multi-level visualization design, ATVis enables the examination of model robustness from various granularity, facilitating a detailed understanding of the dynamics in the training epochs. The framework reveals the complex relationship between adversarial robustness and standard accuracy, which further offers insights into the mechanisms that drive the trade-offs observed in adversarial training. The effectiveness of the framework is demonstrated through case studies.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 71-84"},"PeriodicalIF":3.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-12-01, DOI: 10.1016/j.visinf.2024.10.005
João Moreira, Daniel Mendes, Daniel Gonçalves
{"title":"Incidental visualizations: How complexity factors influence task performance","authors":"João Moreira , Daniel Mendes , Daniel Gonçalves","doi":"10.1016/j.visinf.2024.10.005","DOIUrl":"10.1016/j.visinf.2024.10.005","url":null,"abstract":"<div><div>Incidental visualizations convey information to a person during an ongoing primary task, without the person consciously searching for or requesting that information. They differ from glanceable visualizations by not being people’s main focus, and from ambient visualizations by not being embedded in the environment. Instead, they are presented as secondary information that can be observed without a person losing focus on their current task. However, despite extensive research on glanceable and ambient visualizations, the topic of incidental visualizations is yet a novel topic in current research. To bridge this gap, we conducted an empirical user study presenting participants with an incidental visualization while performing a primary task. We aimed to understand how complexity contributory factors — task complexity, output complexity, and pressure — affected primary task performance and incidental visualization accuracy. Our findings showed that incidental visualizations effectively conveyed information without disrupting the primary task, but working memory limitations should be considered. Additionally, output and pressure significantly influenced the primary task’s results. In conclusion, our study provides insights into the perception accuracy and performance impact of incidental visualizations in relation to complexity factors.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 85-96"},"PeriodicalIF":3.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-12-01, DOI: 10.1016/j.visinf.2024.09.007
An-An Liu, Quanhan Wu, Chenxi Huang, Chao Xue, Xianzhu Liu, Ning Xu
{"title":"A fine-grained deconfounding study for knowledge-based visual dialog","authors":"An-An Liu , Quanhan Wu , Chenxi Huang , Chao Xue , Xianzhu Liu , Ning Xu","doi":"10.1016/j.visinf.2024.09.007","DOIUrl":"10.1016/j.visinf.2024.09.007","url":null,"abstract":"<div><div>Knowledge-based Visual Dialog is a challenging vision-language task, where an agent engages in dialog to answer questions with humans based on the input image and corresponding commonsense knowledge. The debiasing methods based on causal graphs have gradually sparked much attention in the field of Visual Dialog (VD), yielding impressive achievements. However, existing studies focus on the coarse-grained deconfounding, which lacks a principled analysis of the bias. In this paper, we propose a fined-grained study of deconfounding on: (1) We define the confounder from two perspectives. The first is user preference (denoted as <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi></mrow></msub></math></span>), derived from human-annotated dialog history, which may introduce spurious correlations between questions and answers. The second is commonsense language bias (denoted as <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi></mrow></msub></math></span>), where certain words appear so frequently in the retrieved commonsense knowledge that the model tends to memorize these patterns, thereby establishing spurious correlations between the commonsense knowledge and the answers. (2) Given that the current question directly influences answer generation, we further decompose the confounders into <span><math><mrow><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi><mn>1</mn></mrow></msub><mo>,</mo><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi><mn>2</mn></mrow></msub></mrow></math></span> and <span><math><mrow><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi><mn>1</mn></mrow></msub><mo>,</mo><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi><mn>2</mn></mrow></msub></mrow></math></span>, based on their relevance to the current question. Specifically, <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi><mn>1</mn></mrow></msub></math></span> and <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi><mn>1</mn></mrow></msub></math></span> represent dialog history and high-frequency words that are highly correlated with the current question, while <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi><mn>2</mn></mrow></msub></math></span> and <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi><mn>2</mn></mrow></msub></math></span> are sampled from dialog history and words with low relevance to the current question. Through a comprehensive evaluation and comparison of all components, we demonstrate the necessity of jointly considering both <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>h</mi></mrow></msub></math></span> and <span><math><msub><mrow><mi>U</mi></mrow><mrow><mi>c</mi></mrow></msub></math></span>. Fine-grained deconfounding, particularly with respect to the current question, proves to be more effective. 
Ablation studies, quantitative results, and visualizations further confirm the effectiveness of the proposed method.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 36-47"},"PeriodicalIF":3.8,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143098852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
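The causal-graph framing in the abstract corresponds to the standard backdoor adjustment. Shown below, in the abstract's notation, is the generic textbook form under an assumed independence of the two confounders; it is not the paper's specific estimator.

```latex
% Generic backdoor adjustment over the two confounders named in the abstract:
% U_h (user preference from dialog history) and U_c (commonsense language bias).
% Q denotes the current question (with its image/history/knowledge context), A the answer.
P(A \mid \mathrm{do}(Q)) \;=\; \sum_{u_h} \sum_{u_c}
  P(A \mid Q, u_h, u_c)\, P(u_h)\, P(u_c)
```

The fine-grained variant described in the abstract further splits U_h into its question-relevant part U_h1 and question-irrelevant part U_h2, and likewise U_c into U_c1 and U_c2, before performing the adjustment.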
{"title":"Intelligent CAD 2.0","authors":"Qiang Zou, Yingcai Wu, Zhenyu Liu, Weiwei Xu, Shuming Gao","doi":"10.1016/j.visinf.2024.10.001","DOIUrl":"10.1016/j.visinf.2024.10.001","url":null,"abstract":"<div><div>Integrating modern artificial intelligence (AI) techniques, particularly generative AI, holds the promise of revolutionizing computer-aided design (CAD) tools and the engineering design process. However, the direction of “AI+CAD” remains unclear: how will the current generation of intelligent CAD (ICAD) differ from its predecessor in the 1980s and 1990s, what strategic pathways should researchers and engineers pursue for its implementation, and what potential technical challenges might arise?</div><div>As an attempt to address these questions, this paper investigates the transformative role of modern AI techniques in advancing CAD towards ICAD. It first analyzes the design process and reconsiders the roles AI techniques can assume in this process, highlighting how they can restructure the path humans, computers, and designs interact with each other. The primary conclusion is that ICAD systems should assume an intensional rather than extensional role in the design process. This offers insights into the evaluation of the previous generation of ICAD (ICAD 1.0) and outlines a prospective framework and trajectory for the next generation of ICAD (ICAD 2.0).</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 4","pages":"Pages 1-12"},"PeriodicalIF":3.8,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-09-01, DOI: 10.1016/j.visinf.2024.09.002
Shuangcheng Jiao, Jiang Cheng, Zhaosong Huang, Tong Li, Tiankai Xie, Wei Chen, Yuxin Ma, Xumeng Wang
{"title":"DPKnob: A visual analysis approach to risk-aware formulation of differential privacy schemes for data query scenarios","authors":"Shuangcheng Jiao , Jiang Cheng , Zhaosong Huang , Tong Li , Tiankai Xie , Wei Chen , Yuxin Ma , Xumeng Wang","doi":"10.1016/j.visinf.2024.09.002","DOIUrl":"10.1016/j.visinf.2024.09.002","url":null,"abstract":"<div><div>Differential privacy is an essential approach for privacy preservation in data queries. However, users face a significant challenge in selecting an appropriate privacy scheme, as they struggle to balance the utility of query results with the preservation of diverse individual privacy. Customizing a privacy scheme becomes even more complex in dealing with queries that involve multiple data attributes. When adversaries attempt to breach privacy firewalls by conducting multiple regular data queries with various attribute values, data owners must arduously discern unpredictable disclosure risks and construct suitable privacy schemes. In this paper, we propose a visual analysis approach for formulating privacy schemes of differential privacy. Our approach supports the identification and simulation of potential privacy attacks in querying statistical results of multi-dimensional databases. We also developed a prototype system, called DPKnob, which integrates multiple coordinated views. DPKnob not only allows users to interactively assess and explore privacy exposure risks by browsing high-risk attacks, but also facilitates an iterative process for formulating and optimizing privacy schemes based on differential privacy. This iterative process allows users to compare different schemes, refine their expectations of privacy and utility, and ultimately establish a well-balanced privacy scheme. The effectiveness of this study is verified by a user study and two case studies with real-world datasets.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 3","pages":"Pages 42-52"},"PeriodicalIF":3.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142358092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-09-01, DOI: 10.1016/j.visinf.2024.09.001
Shaoxu Meng, Tong Wu, Fang-Lue Zhang, Shu-Yu Chen, Yuewen Ma, Wenbo Hu, Lin Gao
{"title":"AvatarWild: Fully controllable head avatars in the wild","authors":"Shaoxu Meng , Tong Wu , Fang-Lue Zhang , Shu-Yu Chen , Yuewen Ma , Wenbo Hu , Lin Gao","doi":"10.1016/j.visinf.2024.09.001","DOIUrl":"10.1016/j.visinf.2024.09.001","url":null,"abstract":"<div><div>Recent advancements in the field have resulted in significant progress in achieving realistic head reconstruction and manipulation using neural radiance fields (NeRF). Despite these advances, capturing intricate facial details remains a persistent challenge. Moreover, casually captured input, involving both head poses and camera movements, introduces additional difficulties to existing methods of head avatar reconstruction. To address the challenge posed by video data captured with camera motion, we propose a novel method, AvatarWild, for reconstructing head avatars from monocular videos taken by consumer devices. Notably, our approach decouples the camera pose and head pose, allowing reconstructed avatars to be visualized with different poses and expressions from novel viewpoints. To enhance the visual quality of the reconstructed facial avatar, we introduce a view-dependent detail enhancement module designed to augment local facial details without compromising viewpoint consistency. Our method demonstrates superior performance compared to existing approaches, as evidenced by reconstruction and animation results on both multi-view and single-view datasets. Remarkably, our approach stands out by exclusively relying on video data captured by portable devices, such as smartphones. This not only underscores the practicality of our method but also extends its applicability to real-world scenarios where accessibility and ease of data capture are crucial.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 3","pages":"Pages 96-106"},"PeriodicalIF":3.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-09-01, DOI: 10.1016/j.visinf.2024.08.002
Han Bao, Xuhong Zhang, Qinying Wang, Kangming Liang, Zonghui Wang, Shouling Ji, Wenzhi Chen
{"title":"MILG: Realistic lip-sync video generation with audio-modulated image inpainting","authors":"Han Bao , Xuhong Zhang , Qinying Wang , Kangming Liang , Zonghui Wang , Shouling Ji , Wenzhi Chen","doi":"10.1016/j.visinf.2024.08.002","DOIUrl":"10.1016/j.visinf.2024.08.002","url":null,"abstract":"<div><div>Existing lip synchronization (lip-sync) methods generate accurately synchronized mouths and faces in a generated video. However, they still confront the problem of artifacts in regions of non-interest (RONI), <em>e.g.</em>, background and other parts of a face, which decreases the overall visual quality. To solve these problems, we innovatively introduce diverse image inpainting to lip-sync generation. We propose Modulated Inpainting Lip-sync GAN (MILG), an audio-constraint inpainting network to predict synchronous mouths. MILG utilizes prior knowledge of RONI and audio sequences to predict lip shape instead of image generation, which can keep the RONI consistent. Specifically, we integrate modulated spatially probabilistic diversity normalization (MSPD Norm) in our inpainting network, which helps the network generate fine-grained diverse mouth movements guided by the continuous audio features. Furthermore, to lower the training overhead, we modify the contrastive loss in lip-sync to support small-batch-size and few-sample training. Extensive experiments demonstrate that our approach outperforms the existing state-of-the-art of image quality and authenticity while keeping lip-sync.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 3","pages":"Pages 71-81"},"PeriodicalIF":3.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142417742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Visual Informatics, Pub Date: 2024-09-01, DOI: 10.1016/j.visinf.2024.09.005
Tong Zhang, Jie Li, Chao Xu
{"title":"Visual exploration of multi-dimensional data via rule-based sample embedding","authors":"Tong Zhang, Jie Li, Chao Xu","doi":"10.1016/j.visinf.2024.09.005","DOIUrl":"10.1016/j.visinf.2024.09.005","url":null,"abstract":"<div><div>We propose an approach to learning sample embedding for analyzing multi-dimensional datasets. The basic idea is to extract rules from the given dataset and learn the embedding for each sample based on the rules it satisfies. The approach can filter out pattern-irrelevant attributes, leading to significant visual structures of samples satisfying the same rules in the projection. In addition, analysts can understand a visual structure based on the rules that the involved samples satisfy, which improves the projection’s pattern interpretability. Our research involves two methods for achieving and applying the approach. First, we give a method to learn rule-based embedding for each sample. Second, we integrate the method into a system to achieve an analytical workflow. Cases on real-world dataset and quantitative experiment results show the usability and effectiveness of our approach.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"8 3","pages":"Pages 53-56"},"PeriodicalIF":3.8,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142417741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}