InterChat: Enhancing Generative Visual Analytics using Multimodal Interactions

IF 2.9 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computer Graphics Forum | Pub Date: 2025-05-23 | DOI: 10.1111/cgf.70112

Juntong Chen, Jiang Wu, Jiajing Guo, Vikram Mohanty, Xueming Li, Jorge Piazentin Ono, Wenbin He, Liu Ren, Dongyu Liu

Computer Graphics Forum, Volume 44, Issue 3. Full text: https://onlinelibrary.wiley.com/doi/10.1111/cgf.70112

Citations: 0

Abstract

The rise of Large Language Models (LLMs) and generative visual analytics systems has transformed data-driven insights, yet significant challenges persist in accurately interpreting users' analytical and interaction intents. While language inputs offer flexibility, they often lack precision, making the expression of complex intents inefficient, error-prone, and time-intensive. To address these limitations, we investigate the design space of multimodal interactions for generative visual analytics through a literature review and pilot brainstorming sessions. Building on these insights, we introduce a highly extensible workflow that integrates multiple LLM agents for intent inference and visualization generation. We develop InterChat, a generative visual analytics system that combines direct manipulation of visual elements with natural language inputs. This integration enables precise intent communication and supports progressive, visually driven exploratory data analyses. By employing effective prompt engineering and contextual interaction linking, alongside intuitive visualization and interaction designs, InterChat bridges the gap between user interactions and LLM-driven visualizations, enhancing both interpretability and usability. Extensive evaluations, including two usage scenarios, a user study, and expert feedback, demonstrate the effectiveness of InterChat. Results show significant improvements in the accuracy and efficiency of handling complex visual analytics tasks, highlighting the potential of multimodal interactions to redefine user engagement and analytical depth in generative visual analytics.

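Source journal

Computer Graphics Forum (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 5.80
Self-citation rate: 12.00%
Publication volume: 175
Review time: 3-6 weeks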

Journal description: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.