PromptNavi: Text-to-image generation through interactive prompt visual exploration

Impact factor: 2.8 | Zone 4 (Computer Science) | JCR Q2 (Computer Science, Software Engineering)
Bofei Huang, Haoran Xie
Computers & Graphics, vol. 132, Article 104417. Published 2025-09-19. Journal article.
DOI: 10.1016/j.cag.2025.104417
https://www.sciencedirect.com/science/article/pii/S0097849325002584
Citations: 0

Abstract


Modern text-to-image generative models can create high-quality, impressive images, but users often need extensive trial and error before the output matches their intent. To address this, we propose PromptNavi, a visual exploration interface for node-based prompt composition that leverages large language models to make text-to-image generation more efficient. In contrast to conventional prompting interfaces, PromptNavi lets users directly manipulate and combine visual attributes of target images to refine outputs iteratively. Our user study found that PromptNavi significantly improved usability and reduced cognitive load, and that independent evaluators rated its images as higher in quality. Users achieved better results with less effort across all measured dimensions, including creativity, atmosphere, coherence, and overall impression. We believe PromptNavi may bridge the gap between user intent and generative AI outputs, advancing human-centered generative AI by making generative models accessible to novices with an enhanced user experience. Source code is available at https://github.com/BofeiHuang/PromptNavi.
Source journal: Computers & Graphics-Uk
Category: Engineering & Technology, Computer Science: Software Engineering
CiteScore: 5.30
Self-citation rate: 12.00%
Articles published: 173
Review time: 38 days
Journal description: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, particularly novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.