Lilian Berton, Felipe Mitsuishi, Didier Vega Oliveros
Analysis of active semi-supervised learning
Applied Computing Review, published 2023-03-27. DOI: 10.1145/3555776.3577621 (https://doi.org/10.1145/3555776.3577621)
In many real-world applications, labeled instances are costly, and large labeled training sets are infeasible to obtain. As a result, learning strategies that make the most of few labels, such as semi-supervised learning (SSL) and active learning (AL), are attracting attention. Active learning queries labels for instances in regions where the model is uncertain, while semi-supervised learning classifies using a small set of labeled data. We combine both strategies to investigate how AL improves SSL performance, considering both classification results and computational cost. We present experimental results comparing five AL strategies on seven benchmark datasets encompassing synthetic data, handwritten digit and image recognition, and brain-computer interaction tasks. The best single AL strategy was ranked batch mode, but it has the highest computational cost. On the other hand, a consensus committee approach yields the best results with a low processing footprint.
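To make the combination of AL and SSL concrete, the following is a minimal sketch and not the authors' implementation. It assumes a nearest-centroid classifier with self-training as the SSL component and least-confidence sampling as the AL strategy; neither choice is confirmed by the abstract, and the ranked batch mode and committee strategies it mentions are not shown. The toy data, the function names, the 0.9 confidence threshold, and the query budget are all hypothetical.

```python
import math
import random


def make_blobs(n_per_class, seed=0):
    """Two 2-D Gaussian blobs (toy data, not one of the paper's benchmarks)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, cx in enumerate((-3.0, 3.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0)))
            y.append(label)
    return X, y


def fit_centroids(X, labels):
    """Mean point per class, using only instances that currently carry a label."""
    sums = {}
    for (x0, x1), lab in zip(X, labels):
        if lab is not None:
            s = sums.setdefault(lab, [0.0, 0.0, 0])
            s[0] += x0
            s[1] += x1
            s[2] += 1
    return {lab: (s[0] / s[2], s[1] / s[2]) for lab, s in sums.items()}


def predict_proba(point, cents):
    """Softmax over negative Euclidean distances to the class centroids."""
    scores = {lab: math.exp(-math.hypot(point[0] - c[0], point[1] - c[1]))
              for lab, c in cents.items()}
    total = sum(scores.values())
    return {lab: s / total for lab, s in scores.items()}


def self_train(X, known, threshold=0.9, rounds=5):
    """SSL step: pseudo-label confident unlabeled points, then refit centroids."""
    labels = list(known)  # copy, so pseudo-labels never leak into oracle labels
    for _ in range(rounds):
        cents = fit_centroids(X, labels)
        changed = False
        for i, point in enumerate(X):
            if labels[i] is None:
                proba = predict_proba(point, cents)
                best = max(proba, key=proba.get)
                if proba[best] >= threshold:
                    labels[i] = best
                    changed = True
        if not changed:
            break
    return fit_centroids(X, labels)


def query_least_confident(X, known, cents):
    """AL step: pick the unlabeled instance whose top class probability is lowest."""
    pool = [i for i, lab in enumerate(known) if lab is None]
    return min(pool, key=lambda i: max(predict_proba(X[i], cents).values()))


def active_ssl(X, y_oracle, initial, budget):
    """Alternate self-training and oracle queries; return (accuracy, queried)."""
    known = [None] * len(X)
    for i in initial:
        known[i] = y_oracle[i]
    queried = []
    for _ in range(budget):
        cents = self_train(X, known)
        i = query_least_confident(X, known, cents)
        known[i] = y_oracle[i]  # the oracle reveals the true label
        queried.append(i)
    cents = self_train(X, known)
    correct = 0
    for point, truth in zip(X, y_oracle):
        proba = predict_proba(point, cents)
        correct += max(proba, key=proba.get) == truth
    return correct / len(X), queried


X, y = make_blobs(100)
accuracy, queried = active_ssl(X, y, initial=[0, 1, 2, 100, 101, 102], budget=10)
print(f"accuracy={accuracy:.3f} after {len(queried)} queries")
```

The reported accuracy is transductive: it is measured on the same pool of points the model was trained on, including the labeled ones. Each query triggers a full self-training pass, which illustrates why the cost of the AL strategy matters alongside classification quality.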