Opal: Multimodal Image Generation for News Illustration

Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology. Pub Date: 2022-04-19. DOI: 10.1145/3526113.3545621

Vivian Liu, Han Qiao, Lydia B. Chilton

Citations: 40

Abstract

Advances in multimodal AI have presented people with powerful ways to create images from text. Recent work has shown that text-to-image generations are able to represent a broad range of subjects and artistic styles. However, finding the right visual language for text prompts is difficult. In this paper, we address this challenge with Opal, a system that produces text-to-image generations for news illustration. Given an article, Opal guides users through a structured search for visual concepts and provides a pipeline allowing users to generate illustrations based on an article’s tone, keywords, and related artistic styles. Our evaluation shows that Opal efficiently generates diverse sets of news illustrations, visual assets, and concept ideas. Users with Opal generated two times more usable results than users without. We discuss how structured exploration can help users better understand the capabilities of human-AI co-creative systems.

