PromptNavi: Text-to-image generation through interactive prompt visual exploration

IF 2.8 · CAS Zone 4 (Computer Science) · JCR Q2 (Computer Science, Software Engineering)

Computers & Graphics-Uk · Pub Date: 2025-09-19 · DOI: 10.1016/j.cag.2025.104417

Bofei Huang, Haoran Xie

Computers & Graphics-Uk, Volume 132, Article 104417. Publisher page: https://www.sciencedirect.com/science/article/pii/S0097849325002584

Citations: 0

Abstract


Modern text-to-image generative models can create impressive, high-quality images, but users often need extensive trial and error before the output matches their intent. To address this issue, we propose PromptNavi, a visual exploration interface for node-based prompt composition that leverages large language models to make text-to-image generation more efficient. In contrast to conventional prompting interfaces, PromptNavi lets users directly manipulate and combine visual attributes of target images to refine outputs iteratively. Our user study found that PromptNavi significantly improved usability and reduced cognitive load, and that independent evaluators rated its images as higher in quality. Users achieved better results with less effort across all measured dimensions, including creativity, atmosphere, coherence, and overall impression. We believe PromptNavi may bridge the gap between user intent and generative AI outputs, advancing human-centered generative AI by making generative models accessible to novices with an enhanced user experience. Source code is available at https://github.com/BofeiHuang/PromptNavi.


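Source journal

Computers & Graphics-Uk (Engineering & Technology: Computer Science, Software Engineering)

CiteScore: 5.30
Self-citation rate: 12.00%
Publication count: 173
Review time: 38 days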

Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.