The Comparison of the Effectiveness and Efficiency of Fine-Tuning Models on Stable Diffusion in Creating Concept Art

Abdul Bilal Qowy, Ahmad Nur Ihsan, Sri Hartati
JURNAL TEKNIK INFORMATIKA. Journal Article. Published 2024-05-20. DOI: 10.15408/jti.v17i1.37942 (https://doi.org/10.15408/jti.v17i1.37942)

Abstract

This research aims to overcome the limitations of the Stable Diffusion model in creating concept art, and covers problem identification, research objectives, methodology, and results. Although Stable Diffusion has been recognized as the best model, especially for creating conceptual artwork, there is still a need to simplify the concept art creation process and to find the most suitable generative model. The research used three methods: the Latent Diffusion Model, Dreambooth (a fine-tuning model), and Stable Diffusion. The results show that the Dreambooth model produces a more realistic painting style, while Textual Inversion tends towards a fantasy, cartoon-like style. Both are relatively effective, with minimal differences between them, but the Dreambooth model proves more effective based on the consistency of its FID, PSNR, and visual perception scores. The Dreambooth model is also more efficient in training time, although it requires more memory, while inference time is relatively similar for both. This research makes a significant contribution to the development of artificial intelligence in the creative industries, opens up opportunities to improve the use of generative models in creating concept art, and could drive positive change in the use of artificial intelligence in the creative industries more broadly.
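As an illustration of one of the metrics named above, PSNR (peak signal-to-noise ratio) can be computed from the mean squared error between a reference image and a generated image. The following is a minimal sketch, not the authors' evaluation code: the function name `psnr`, the 8-bit pixel range (`max_value=255`), and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Assumes pixel values lie in [0, max_value]; 255 corresponds to 8-bit images.
    """
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    if ref.shape != gen.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over all pixels and channels.
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A higher PSNR means the generated image is numerically closer to the reference. FID, in contrast, compares feature statistics across whole sets of images rather than individual pairs, so it is not covered by this sketch.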