Self-supervised multi-level generative adversarial network data imputation algorithm

IF 3.0 | Zone 3: Computer Science | JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Approximate Reasoning Pub Date : 2025-08-27 DOI:10.1016/j.ijar.2025.109553

Yi Xu , Shujuan Fang , Xuhui Xing

Volume 187, Article 109553. Full text: https://www.sciencedirect.com/science/article/pii/S0888613X2500194X

Citations: 0

Abstract


Self-supervised multi-level generative adversarial network data imputation algorithm

Missing data has long been a challenging problem in machine learning. Generative Adversarial Imputation Networks (GAIN) have been shown to outperform many existing solutions. However, because missing values have no ground truth to serve as supervision, GAIN cannot construct a reconstruction loss for them; it can judge the plausibility of imputed values only through the reconstruction loss on non-missing values and the adversarial loss. From the perspective of granular computing, data is organized in levels, and data at different levels of granularity encapsulates different knowledge. Building on granular computing, this paper proposes a self-supervised multi-level generative adversarial network data imputation algorithm (MGAIN). First, multiple levels of data are constructed using nested feature set sequences. Then, GAIN is used to impute missing values at the coarsest granularity level, and the imputation results at a coarse granularity level serve as supervision for imputing missing values at a finer level, which makes it possible to construct a reconstruction loss for missing values at that finer level. Finally, data at the finer granularity level is imputed using the reconstruction loss of missing values, the reconstruction loss of non-missing values, and the adversarial loss. MGAIN thus imputes missing values level by level, from coarse to fine granularity, to obtain more accurate imputation results. Experimental results validate the effectiveness of the proposed method.
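The abstract gives no formulas, so the following is only a rough sketch of how the three loss terms at a fine level might be combined. Everything in it is an assumption made for illustration: the function names `nested_feature_sets` and `fine_level_loss`, the weights `alpha` and `beta`, the choice of column prefixes as nested feature sets, and the squared-error form of both reconstruction terms. The adversarial term follows the generator loss of the original GAIN formulation, where the discriminator estimates, per entry, the probability that the entry was observed. It should not be read as the authors' implementation.

```python
import numpy as np

def nested_feature_sets(n_features, n_levels):
    """Hypothetical construction: nested column-index sets, coarse to fine.
    Each set is a prefix of the column order, so F_1 is a subset of F_2, and so on."""
    sizes = np.linspace(n_features / n_levels, n_features, n_levels).round().astype(int)
    return [np.arange(s) for s in sizes]

def fine_level_loss(x, mask, x_hat, coarse_imputed, coarse_cols, d_prob,
                    alpha=1.0, beta=1.0):
    """Assumed generator loss at a fine level.

    x              : (n, d) data; missing entries hold a finite placeholder, not NaN
    mask           : (n, d) 1 where observed, 0 where missing
    x_hat          : (n, d) generator output at this level
    coarse_imputed : (n, k) imputed values from the coarser level
    coarse_cols    : (k,) columns of this level that also belong to the coarser level
    d_prob         : (n, d) discriminator's probability that each entry is observed
    """
    eps = 1e-8
    # Reconstruction loss on observed entries (ground truth is available).
    rec_obs = np.sum(mask * (x - x_hat) ** 2) / max(mask.sum(), 1)
    # Reconstruction loss on missing entries, supervised by the coarser level's output.
    miss_coarse = 1 - mask[:, coarse_cols]
    rec_mis = (np.sum(miss_coarse * (coarse_imputed - x_hat[:, coarse_cols]) ** 2)
               / max(miss_coarse.sum(), 1))
    # Adversarial term: the generator wants imputed entries to look observed.
    miss = 1 - mask
    adv = -np.sum(miss * np.log(d_prob + eps)) / max(miss.sum(), 1)
    return adv + alpha * rec_obs + beta * rec_mis
```

In a level-by-level scheme of this kind, the output of one level would be passed as `coarse_imputed` to the next, with the coarsest level trained by plain GAIN (that is, without the `rec_mis` term).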


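Source journal: International Journal of Approximate Reasoning (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 6.90

Self-citation rate: 12.80%

Articles published: 170

Review time: 67 days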

Journal description: The International Journal of Approximate Reasoning is intended to serve as a forum for the treatment of imprecision and uncertainty in Artificial and Computational Intelligence, covering both the foundations of uncertainty theories and the design of intelligent systems for scientific and engineering applications. It publishes high-quality research papers describing theoretical developments or innovative applications, as well as review articles on topics of general interest. Relevant topics include, but are not limited to, probabilistic reasoning and Bayesian networks, imprecise probabilities, random sets, belief functions (Dempster-Shafer theory), possibility theory, fuzzy sets, rough sets, decision theory, non-additive measures and integrals, qualitative reasoning about uncertainty, comparative probability orderings, game-theoretic probability, default reasoning, nonstandard logics, argumentation systems, inconsistency-tolerant reasoning, elicitation techniques, and philosophical foundations and psychological models of uncertain reasoning. Domains of application for uncertain reasoning systems include risk analysis and assessment, information retrieval and database design, information fusion, machine learning, data and web mining, computer vision, image and signal processing, intelligent data analysis, statistics, and multi-agent systems.