CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation

IF 0.6, Zone 4 (Mathematics), JCR Q3 MATHEMATICS

Doklady Mathematics Pub Date : 2025-03-22 DOI:10.1134/S1064562424602324

V. A. Vasilev, V. S. Arkhipkin, J. D. Agafonova, T. V. Nikulina, E. O. Mironova, A. A. Shichanina, N. A. Gerasimenko, M. A. Shoytov, D. V. Dimitrov

Doklady Mathematics, volume 110, supplement 1, pages S137-S150. Journal Article. https://link.springer.com/article/10.1134/S1064562424602324


Abstract

Popular text-to-image generation models handle international and general cultural queries well, but they have a significant knowledge gap regarding individual cultures. The gap stems from the content of existing large training datasets collected from the Internet, which is predominantly based on Western European or American popular culture. This lack of cultural adaptation can lead to incorrect results, lower generation quality, and the spread of stereotypes and offensive content. To address this issue, we examine the concept of cultural code and recognize how critical it is for modern image generation models to understand it, a need that the research community has not sufficiently addressed to date. We propose a methodology for collecting and processing the data needed to form a dataset based on a cultural code, in particular the Russian one. We explore how the collected data affects generation quality in the national domain and analyze the effectiveness of our approach using the Kandinsky 3.1 text-to-image model. Human evaluation results show an increase in the model's awareness of Russian culture.



Source journal: Doklady Mathematics (Mathematics)

CiteScore: 1.00

Self-citation rate: 16.70%

Review time: 3-6 weeks

Journal description: Doklady Mathematics is a journal of the Presidium of the Russian Academy of Sciences. It contains English translations of papers published in Doklady Akademii Nauk (Proceedings of the Russian Academy of Sciences), which was founded in 1933 and is published 36 times a year. Doklady Mathematics covers the following areas: mathematics, mathematical physics, computer science, control theory, and computers. It publishes brief scientific reports on previously unpublished significant new research in mathematics and its applications. The main contributors to the journal are Members of the RAS, Corresponding Members of the RAS, and scientists from the former Soviet Union and other foreign countries. Contributors include outstanding Russian mathematicians.