PromptNavi: Text-to-image generation through interactive prompt visual exploration
Authors: Bofei Huang, Haoran Xie
Journal: Computers & Graphics-Uk, volume 132, Article 104417 (impact factor 2.8; JCR Q2, Computer Science, Software Engineering)
Publication type: Journal Article
Published: 2025-09-19
DOI: 10.1016/j.cag.2025.104417
URL: https://www.sciencedirect.com/science/article/pii/S0097849325002584
Citation count: 0
Abstract
Modern text-to-image generative models can create impressive, high-quality images, but users often need extensive trial and error before the output matches their intent. To address this issue, we propose PromptNavi, a visual exploration interface for node-based prompt composition that leverages large language models to make text-to-image generation more efficient. In contrast to conventional prompting interfaces, PromptNavi lets users directly manipulate and combine visual attributes of target images to refine outputs iteratively. Our user study showed that PromptNavi significantly improved usability and reduced cognitive load, and that independent evaluators rated the resulting images as higher in quality. Users achieved better results with less effort across all measured dimensions, including creativity, atmosphere, coherence, and overall impression. We believe PromptNavi may bridge the gap between user intent and generative AI outputs, advancing human-centered generative AI by making generative models accessible to novices with an enhanced user experience. Source code is available at: https://github.com/BofeiHuang/PromptNavi.
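To make the idea of node-based prompt composition concrete, the following is a minimal sketch and not the PromptNavi implementation. It assumes a hypothetical `AttributeNode` structure in which each node holds one visual attribute, and a hypothetical `compose_prompt` helper that joins the enabled nodes into a single prompt string. The large language model component described in the abstract is omitted entirely, and all names and example phrases are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeNode:
    """One visual attribute of the target image (hypothetical structure)."""
    category: str          # e.g. "subject", "style", "lighting"
    text: str              # the phrase this node contributes to the prompt
    enabled: bool = True   # disabled nodes are skipped during composition
    children: list["AttributeNode"] = field(default_factory=list)


def compose_prompt(root: AttributeNode) -> str:
    """Depth-first walk that joins the text of every enabled node."""
    parts: list[str] = []

    def visit(node: AttributeNode) -> None:
        if not node.enabled:
            return  # a disabled node also hides its whole subtree
        parts.append(node.text)
        for child in node.children:
            visit(child)

    visit(root)
    return ", ".join(parts)


# Illustrative graph: one subject node with three attribute nodes attached.
root = AttributeNode("subject", "a lighthouse on a cliff", children=[
    AttributeNode("style", "watercolor"),
    AttributeNode("lighting", "golden hour", enabled=False),
    AttributeNode("atmosphere", "calm"),
])
prompt = compose_prompt(root)
print(prompt)
```

Toggling a node's `enabled` flag and recomposing stands in for the iterative refinement the abstract describes: the prompt changes by editing attributes rather than retyping the full text.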
About the journal:
Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.