半监督回归的安全协同训练

IF 0.8 4区计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Intelligent Data Analysis Pub Date : 2023-05-25 DOI:10.3233/ida-226718

Liyan Liu, P. Huang, Hong Yu, Fan Min

{"title":"半监督回归的安全协同训练","authors":"Liyan Liu, P. Huang, Hong Yu, Fan Min","doi":"10.3233/ida-226718","DOIUrl":null,"url":null,"abstract":"Co-training is a popular semi-supervised learning method. The learners exchange pseudo-labels obtained from different views to reduce the accumulation of errors. One of the key issues is how to ensure the quality of pseudo-labels. However, the pseudo-labels obtained during the co-training process may be inaccurate. In this paper, we propose a safe co-training (SaCo) algorithm for regression with two new characteristics. First, the safe labeling technique obtains pseudo-labels that are certified by both views to ensure their reliability. It differs from popular techniques of using two views to assign pseudo-labels to each other. Second, the label dynamic adjustment strategy updates the previous pseudo-labels to keep them up-to-date. These pseudo-labels are predicted using the augmented training data. Experiments are conducted on twelve datasets commonly used for regression testing. Results show that SaCo is superior to other co-training style regression algorithms and state-of-the-art semi-supervised regression algorithms.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"164 S343","pages":"959-975"},"PeriodicalIF":0.8000,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Safe co-training for semi-supervised regression\",\"authors\":\"Liyan Liu, P. Huang, Hong Yu, Fan Min\",\"doi\":\"10.3233/ida-226718\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Co-training is a popular semi-supervised learning method. The learners exchange pseudo-labels obtained from different views to reduce the accumulation of errors. One of the key issues is how to ensure the quality of pseudo-labels. However, the pseudo-labels obtained during the co-training process may be inaccurate. In this paper, we propose a safe co-training (SaCo) algorithm for regression with two new characteristics. First, the safe labeling technique obtains pseudo-labels that are certified by both views to ensure their reliability. It differs from popular techniques of using two views to assign pseudo-labels to each other. Second, the label dynamic adjustment strategy updates the previous pseudo-labels to keep them up-to-date. These pseudo-labels are predicted using the augmented training data. Experiments are conducted on twelve datasets commonly used for regression testing. Results show that SaCo is superior to other co-training style regression algorithms and state-of-the-art semi-supervised regression algorithms.\",\"PeriodicalId\":50355,\"journal\":{\"name\":\"Intelligent Data Analysis\",\"volume\":\"164 S343\",\"pages\":\"959-975\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2023-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent Data Analysis\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.3233/ida-226718\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-226718","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

共同训练是一种流行的半监督学习方法。学习者交换从不同角度获得的伪标签，以减少错误的积累。其中一个关键问题是如何保证伪标签的质量。然而，在共同训练过程中获得的伪标签可能是不准确的。本文提出了一种具有两个新特征的安全协同训练(SaCo)回归算法。首先，安全标签技术获得伪标签，伪标签通过两个视图的认证，以确保其可靠性。它不同于使用两个视图相互分配伪标签的流行技术。其次，标签动态调整策略更新以前的伪标签，使它们保持最新状态。这些伪标签是使用增强的训练数据来预测的。在12个常用的回归测试数据集上进行了实验。结果表明，SaCo算法优于其他协同训练式回归算法和最先进的半监督回归算法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Safe co-training for semi-supervised regression

Co-training is a popular semi-supervised learning method. The learners exchange pseudo-labels obtained from different views to reduce the accumulation of errors. One of the key issues is how to ensure the quality of pseudo-labels. However, the pseudo-labels obtained during the co-training process may be inaccurate. In this paper, we propose a safe co-training (SaCo) algorithm for regression with two new characteristics. First, the safe labeling technique obtains pseudo-labels that are certified by both views to ensure their reliability. It differs from popular techniques of using two views to assign pseudo-labels to each other. Second, the label dynamic adjustment strategy updates the previous pseudo-labels to keep them up-to-date. These pseudo-labels are predicted using the augmented training data. Experiments are conducted on twelve datasets commonly used for regression testing. Results show that SaCo is superior to other co-training style regression algorithms and state-of-the-art semi-supervised regression algorithms.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Intelligent Data Analysis 工程技术-计算机：人工智能

CiteScore

2.20

自引率

5.90%

发文量

审稿时长

3.3 months

期刊介绍： Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.