Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.11.002
Júlio Castro Lopes, Rui Pedro Lopes

Computer Vision in Augmented, Virtual, Mixed and Extended Reality environments—A bibliometric review

Abstract: This work presents a bibliometric analysis of the literature on the use of computer vision algorithms in Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR), and Extended Reality (XR) environments. The analysis highlights the evolution, trends, and impact of research in this field. The review provides an overview of immersive technologies and their applications, the role of computer vision algorithms in enabling these technologies, and the potential benefits of using such algorithms. Using bibliometric indicators such as citation counts, co-citation analysis, and network analysis, the study identifies influential authors, institutions, and research themes. It also identifies gaps and opportunities for further research in this area and offers a critical assessment of the quality and relevance of the publications.

Visual Informatics 8(4): 13–22.
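Of the bibliometric indicators mentioned in the abstract, co-citation analysis is the least self-explanatory: two papers are co-cited when they appear together in a third paper's reference list. A minimal sketch of the counting step, on hypothetical reference lists (not data from this review):

```python
from itertools import combinations
from collections import Counter

# Hypothetical reference lists: each citing paper maps to the papers it cites.
references = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}

def co_citation_counts(refs):
    """Count how often each unordered pair of papers is cited together."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

counts = co_citation_counts(references)
# ("A", "B") is co-cited by P1 and P2, so its count is 2.
```

The resulting pair counts are typically used as edge weights of a co-citation network, which is then clustered to surface research themes.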
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.10.004
Jie Liu, Jie Li, Jielong Kuang

Generative model-assisted sample selection for interest-driven progressive visual analytics

Abstract: We propose interest-driven progressive visual analytics. The core idea is to filter samples with features of interest to analysts from the given dataset for analysis. The approach relies on a generative model (GM) trained on the given dataset. The characteristics of the GM make it convenient to find ideal generated samples in its latent space. We then filter the original samples similar to these ideal generated ones to explore patterns. Our research involves two methods for realizing and applying this idea. First, we present a method to explore ideal samples in a GM's latent space. Second, we integrate the method into a system to form an embedding-based analytical workflow. Patterns found on open datasets in case studies, results of quantitative experiments, and positive feedback from experts illustrate the general usability and effectiveness of the approach.

Visual Informatics 8(4): 97–108.
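The paper's own filtering method is not detailed in the abstract; a common baseline for "filter the original samples similar to the ideal generated ones" is nearest-neighbor retrieval by cosine similarity in the embedding space. A minimal sketch under that assumption (all data here is synthetic):

```python
import numpy as np

def select_similar(samples, ideal, k=3):
    """Return indices of the k samples closest (by cosine) to the ideal embedding."""
    s = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    i = ideal / np.linalg.norm(ideal)
    sims = s @ i                      # cosine similarity to the ideal sample
    return np.argsort(-sims)[:k]      # indices of the top-k most similar

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 8))              # stand-in for encoded originals
ideal = embeddings[42] + 0.01 * rng.normal(size=8)  # an "ideal" point near sample 42
picked = select_similar(embeddings, ideal, k=5)
# Sample 42 should rank among the selected indices.
```

In the paper's workflow, the ideal point would come from exploring the GM's latent space rather than from perturbing a known sample as done here for illustration.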
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.10.002
Yang Zhang, Jie Li, Xu Chao

ChemNav: An interactive visual tool to navigate in the latent space for chemical molecules discovery

Abstract: In recent years, AI-driven drug development has emerged as a prominent research topic in computational chemistry. A key focus is the application of generative models to molecule synthesis, creating extensive virtual libraries of chemical molecules based on latent spaces. However, locating molecules with desirable properties within these vast latent spaces remains a significant challenge: large regions of invalid samples, called "dead zones", impede exploration efficiency, and the process is often time-consuming and repetitive. We therefore propose a visualization system that helps experts identify potential molecules with desirable properties as they explore the latent space. Specifically, we conducted a literature survey on the application of generative networks in drug synthesis to summarize the tasks, followed by expert interviews to determine requirements. Based on these requirements, we introduce ChemNav, an interactive visual tool for navigating the latent space in search of desirable molecules. ChemNav incorporates a heuristic latent-space interpolation path search algorithm to enhance the efficiency of valid molecule generation, and a similar-sample search algorithm to accelerate the discovery of similar molecules. Evaluations of ChemNav through two case studies, a user study, and experiments demonstrate its effectiveness in inspiring researchers to explore the latent space for chemical molecule discovery.

Visual Informatics 8(4): 60–70.
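ChemNav's heuristic path-search algorithm is not specified in the abstract; the baseline it improves on is plain latent-space interpolation between two encoded molecules, where intermediate points are decoded and checked for validity. A minimal sketch of that baseline (names and dimensions are illustrative):

```python
import numpy as np

def lerp_path(z_start, z_end, steps=8):
    """Linear interpolation path between two latent vectors."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1 - t) * z_start + t * z_end for t in ts])

z_a = np.zeros(4)   # stand-in latent code of molecule A
z_b = np.ones(4)    # stand-in latent code of molecule B
path = lerp_path(z_a, z_b, steps=5)
# path[0] equals z_a and path[-1] equals z_b; in practice each
# intermediate point would be decoded to a molecule and validated,
# and a heuristic search would bend the path around "dead zones".
```

A straight line like this can cross invalid regions, which is exactly the inefficiency a heuristic path search is meant to avoid.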
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.09.006
Magnus Nylin, Jonas Lundberg, Magnus Bång, Kostiantyn Kucher

Glyph design for communication initiation in real-time human-automation collaboration

Abstract: Initiating communication and conveying critical information to the human operator is a key problem in human-automation collaboration. This problem is particularly pronounced in time-constrained, safety-critical domains such as Air Traffic Management. A visual representation should help operators understand why the system initiates the communication, when the operator must act, and the consequences of not responding to the cue. Data glyphs can present multidimensional data, including temporal data, in a compact format to facilitate this type of communication. In this paper, we propose a glyph design for communication initiation in highly automated systems in Air Traffic Management, Vessel Traffic Service, and Train Traffic Management. The design was assessed by experts in these domains in three workshop sessions. The results showed that the number of glyphs presented simultaneously and the type of situation are domain-specific design aspects that need to be adjusted for each work domain. They also showed that the core of the glyph design can be reused across domains, and that operators can successfully interpret the temporal data representations. We discuss similarities and differences in the applicability of the glyph design across the domains, and provide suggestions for future work based on the results of this study.

Visual Informatics 8(4): 23–35.
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.10.003
Fang Zhu, Xufei Zhu, Xumeng Wang, Yuxin Ma, Jieqiong Zhao

ATVis: Understanding and diagnosing adversarial training processes through visual analytics

Abstract: Adversarial training has emerged as a major defense against adversarial perturbations in deep neural networks, mitigating the exploitation of model vulnerabilities to produce incorrect predictions. Despite enhancing robustness, adversarial training often trades off standard accuracy on normal data, a phenomenon that remains contentious. In addition, the opaque nature of deep neural networks makes it difficult to inspect and diagnose how adversarial training processes evolve. This paper introduces ATVis, a visual analytics framework for examining and diagnosing adversarial training processes. Through a multi-level visualization design, ATVis enables the examination of model robustness at various granularities, facilitating a detailed understanding of the dynamics across training epochs. The framework reveals the complex relationship between adversarial robustness and standard accuracy, offering insights into the mechanisms that drive the trade-offs observed in adversarial training. The effectiveness of the framework is demonstrated through case studies.

Visual Informatics 8(4): 71–84.
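For readers unfamiliar with the perturbations that adversarial training defends against, the canonical example is the Fast Gradient Sign Method (FGSM). The sketch below applies it to a toy logistic model rather than a deep network (ATVis itself diagnoses training, it does not prescribe this attack); all weights and inputs are made up:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps):
    """FGSM perturbation for a logistic model p(y=+1|x) = sigmoid(w @ x).

    y is +1 or -1; the step moves x in the direction that increases
    the logistic loss -log sigmoid(y * w @ x).
    """
    grad = -y * (1.0 - sigmoid(y * np.dot(w, x))) * w   # d(loss)/dx
    return x + eps * np.sign(grad)

w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, -0.2, 0.1])       # classified positive: w @ x = 0.75 > 0
x_adv = fgsm(x, y=1, w=w, eps=0.5)   # small L-infinity perturbation
# The attack lowers the model's margin; here it flips the prediction.
```

Adversarial training would feed such perturbed samples back into the loss at every epoch, which is the process whose robustness/accuracy dynamics ATVis visualizes.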
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.10.005
João Moreira, Daniel Mendes, Daniel Gonçalves

Incidental visualizations: How complexity factors influence task performance

Abstract: Incidental visualizations convey information to a person during an ongoing primary task, without the person consciously searching for or requesting that information. They differ from glanceable visualizations by not being the person's main focus, and from ambient visualizations by not being embedded in the environment. Instead, they are presented as secondary information that can be observed without losing focus on the current task. Despite extensive research on glanceable and ambient visualizations, incidental visualizations remain a novel topic in current research. To bridge this gap, we conducted an empirical user study in which participants were shown an incidental visualization while performing a primary task. We aimed to understand how contributory complexity factors (task complexity, output complexity, and pressure) affect primary task performance and incidental visualization accuracy. Our findings showed that incidental visualizations effectively conveyed information without disrupting the primary task, although working memory limitations should be considered. Additionally, output complexity and pressure significantly influenced primary task results. In conclusion, our study provides insights into the perception accuracy and performance impact of incidental visualizations in relation to complexity factors.

Visual Informatics 8(4): 85–96.
Visual Informatics | Pub Date: 2024-12-01 | DOI: 10.1016/j.visinf.2024.09.007
An-An Liu, Quanhan Wu, Chenxi Huang, Chao Xue, Xianzhu Liu, Ning Xu

A fine-grained deconfounding study for knowledge-based visual dialog

Abstract: Knowledge-based Visual Dialog is a challenging vision-language task in which an agent engages in dialog with humans to answer questions based on an input image and corresponding commonsense knowledge. Debiasing methods based on causal graphs have gradually attracted attention in the field of Visual Dialog (VD), yielding impressive achievements. However, existing studies focus on coarse-grained deconfounding and lack a principled analysis of the bias. In this paper, we propose a fine-grained study of deconfounding: (1) We define the confounder from two perspectives. The first is user preference (denoted U_h), derived from human-annotated dialog history, which may introduce spurious correlations between questions and answers. The second is commonsense language bias (denoted U_c), where certain words appear so frequently in the retrieved commonsense knowledge that the model tends to memorize these patterns, establishing spurious correlations between the commonsense knowledge and the answers. (2) Given that the current question directly influences answer generation, we further decompose the confounders into U_h1, U_h2 and U_c1, U_c2 based on their relevance to the current question. Specifically, U_h1 and U_c1 represent dialog history and high-frequency words that are highly correlated with the current question, while U_h2 and U_c2 are sampled from dialog history and words with low relevance to the current question. Through a comprehensive evaluation and comparison of all components, we demonstrate the necessity of jointly considering both U_h and U_c. Fine-grained deconfounding, particularly with respect to the current question, proves more effective. Ablation studies, quantitative results, and visualizations further confirm the effectiveness of the proposed method.

Visual Informatics 8(4): 36–47.
Visual Informatics | Pub Date: 2024-10-09 | DOI: 10.1016/j.visinf.2024.10.001
Qiang Zou, Yingcai Wu, Zhenyu Liu, Weiwei Xu, Shuming Gao

Intelligent CAD 2.0

Abstract: Integrating modern artificial intelligence (AI) techniques, particularly generative AI, holds the promise of revolutionizing computer-aided design (CAD) tools and the engineering design process. However, the direction of "AI+CAD" remains unclear: how will the current generation of intelligent CAD (ICAD) differ from its predecessor of the 1980s and 1990s, what strategic pathways should researchers and engineers pursue for its implementation, and what technical challenges might arise? As an attempt to address these questions, this paper investigates the transformative role of modern AI techniques in advancing CAD towards ICAD. It first analyzes the design process and reconsiders the roles AI techniques can assume in it, highlighting how they can restructure the way humans, computers, and designs interact with each other. The primary conclusion is that ICAD systems should assume an intensional rather than extensional role in the design process. This offers insights into the evaluation of the previous generation of ICAD (ICAD 1.0) and outlines a prospective framework and trajectory for the next generation (ICAD 2.0).

Visual Informatics 8(4): 1–12.
Visual Informatics | Pub Date: 2024-09-01 | DOI: 10.1016/j.visinf.2024.09.002
Shuangcheng Jiao, Jiang Cheng, Zhaosong Huang, Tong Li, Tiankai Xie, Wei Chen, Yuxin Ma, Xumeng Wang

DPKnob: A visual analysis approach to risk-aware formulation of differential privacy schemes for data query scenarios

Abstract: Differential privacy is an essential approach for privacy preservation in data queries. However, users face a significant challenge in selecting an appropriate privacy scheme, as they struggle to balance the utility of query results with the preservation of diverse individual privacy. Customizing a privacy scheme becomes even more complex for queries that involve multiple data attributes. When adversaries attempt to breach privacy firewalls by conducting multiple regular data queries with various attribute values, data owners must arduously discern unpredictable disclosure risks and construct suitable privacy schemes. In this paper, we propose a visual analysis approach for formulating differential privacy schemes. Our approach supports the identification and simulation of potential privacy attacks in querying statistical results of multi-dimensional databases. We also developed a prototype system, called DPKnob, which integrates multiple coordinated views. DPKnob not only allows users to interactively assess and explore privacy exposure risks by browsing high-risk attacks, but also facilitates an iterative process for formulating and optimizing privacy schemes based on differential privacy. This iterative process allows users to compare different schemes, refine their expectations of privacy and utility, and ultimately establish a well-balanced privacy scheme. The effectiveness of this study is verified by a user study and two case studies with real-world datasets.

Visual Informatics 8(3): 42–52.
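The privacy/utility knob that such schemes tune is the budget epsilon of a differentially private mechanism. As a generic illustration (not DPKnob's implementation), the standard Laplace mechanism for a counting query looks like this; the counts and epsilon below are hypothetical:

```python
import numpy as np

def laplace_count(true_count, epsilon, rng):
    """Release a counting query with epsilon-differential privacy.

    A count has L1 sensitivity 1 (adding or removing one individual
    changes it by at most 1), so Laplace noise with scale 1/epsilon
    suffices for the Laplace mechanism.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

rng = np.random.default_rng(7)
true_count = 120   # hypothetical: rows matching a query predicate
noisy = laplace_count(true_count, epsilon=0.5, rng=rng)

# Averaging many independent releases recovers the true count, which is
# why repeated queries by an adversary consume privacy budget.
releases = [laplace_count(true_count, epsilon=0.5, rng=rng) for _ in range(20000)]
```

Smaller epsilon means larger noise, hence stronger privacy but lower utility; the averaging effect shown at the end is the kind of multi-query disclosure risk the paper's attack simulation surfaces.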
Visual Informatics | Pub Date: 2024-09-01 | DOI: 10.1016/j.visinf.2024.09.001
Shaoxu Meng, Tong Wu, Fang-Lue Zhang, Shu-Yu Chen, Yuewen Ma, Wenbo Hu, Lin Gao

AvatarWild: Fully controllable head avatars in the wild

Abstract: Recent advancements in neural radiance fields (NeRF) have led to significant progress in realistic head reconstruction and manipulation. Despite these advances, capturing intricate facial details remains a persistent challenge. Moreover, casually captured input, involving both head poses and camera movements, introduces additional difficulties for existing head avatar reconstruction methods. To address the challenge posed by video data captured with camera motion, we propose AvatarWild, a novel method for reconstructing head avatars from monocular videos taken by consumer devices. Notably, our approach decouples the camera pose and head pose, allowing reconstructed avatars to be visualized with different poses and expressions from novel viewpoints. To enhance the visual quality of the reconstructed facial avatar, we introduce a view-dependent detail enhancement module designed to augment local facial details without compromising viewpoint consistency. Our method demonstrates superior performance compared to existing approaches, as evidenced by reconstruction and animation results on both multi-view and single-view datasets. Remarkably, our approach relies exclusively on video data captured by portable devices such as smartphones. This underscores the practicality of our method and extends its applicability to real-world scenarios where accessibility and ease of data capture are crucial.

Visual Informatics 8(3): 96–106.