A Novel Multimodal Generative Learning Model based on Basic Fuzzy Concepts
Huankun Sheng, Hongwei Mo, Tengteng Zhang
Cognitive Computation (Q2, JCR category: Computer Science, Artificial Intelligence; region 3, Computer Science)
Published: 2024-08-30 (Journal Article)
DOI: https://doi.org/10.1007/s12559-024-10336-7
Platform: Semanticscholar · Open access: no
Citations: 0
Abstract
Multimodal models are designed to process different types of data within a single generative framework. The prevalent strategy in previous methods involves learning joint representations that are shared across different modalities. These joint representations are typically obtained by concatenating the top layers of modality-specific networks. Recently, significant advancements have been made in generating images from text and vice versa. Despite these successes, current models often overlook the role of fuzzy concepts, which are crucial given that human cognitive processes inherently involve a high degree of fuzziness. Recognizing and incorporating fuzzy concepts is therefore essential for enhancing the effectiveness of multimodal cognition models. In this paper, a novel framework, named the Fuzzy Concept Learning Model (FCLM), is proposed to process modalities based on fuzzy concepts. The high-level abstractions between different modalities in the FCLM are represented by 'fuzzy concept functions.' After training, the FCLM is capable of generating images from attribute descriptions and inferring the attributes of input images. Additionally, it can formulate fuzzy concepts at various levels of abstraction. Extensive experiments were conducted on the dSprites and 3D Chairs datasets. Both qualitative and quantitative results from these experiments demonstrate the effectiveness and efficiency of the proposed framework. The FCLM integrates the fuzzy cognitive mechanism with the statistical characteristics of the environment. This cognition-inspired framework offers a novel perspective for processing multimodal information.
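The abstract does not define the 'fuzzy concept functions' themselves, so the sketch below is only a generic textbook illustration of the underlying idea from fuzzy set theory: an attribute value belongs to a concept to a graded degree in [0, 1] rather than crisply, and a multi-attribute concept can combine per-attribute degrees with a t-norm (minimum). The concept name, attribute names, and prototype parameters are all hypothetical, not taken from the paper.

```python
import math

def gaussian_membership(x, center, width):
    """Degree in [0, 1] to which value x belongs to a fuzzy set.

    Standard Gaussian membership function; the FCLM's actual fuzzy
    concept functions are not specified in the abstract.
    """
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def concept_degree(attributes, prototypes):
    """Fuzzy AND (minimum t-norm) over per-attribute memberships."""
    return min(
        gaussian_membership(attributes[name], center, width)
        for name, (center, width) in prototypes.items()
    )

# Hypothetical concept "small round sprite" over dSprites-style attributes.
small_round = {"scale": (0.2, 0.1), "roundness": (1.0, 0.2)}
degree = concept_degree({"scale": 0.25, "roundness": 0.9}, small_round)
print(round(degree, 3))  # graded membership, strictly between 0 and 1
```

The graded output is what distinguishes this from a crisp classifier: an input near the concept prototype scores close to 1, and membership decays smoothly with distance rather than flipping at a threshold.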
About the journal
Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work involving biologically-inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a new platform for the dissemination of research, current practices and future trends in the emerging discipline of cognitive computation that bridges the gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities.