Learning Generalizable Batch Active Learning Strategies via Deep Q-networks (Student Abstract)
Yichen Li, Wen-Jie Shen, Boyu Zhang, Feng Mao, Zongzhang Zhang, Yang Yu
Proceedings of the ... AAAI Conference on Artificial Intelligence, pp. 16258-16259. Published 2023-06-26.
DOI: https://doi.org/10.1609/aaai.v37i13.26989
Abstract
To handle large amounts of unlabeled data, batch active learning (BAL) queries human annotators in each round for the labels of a batch of the most valuable data points. Most current BAL strategies rely on hand-designed heuristics, such as uncertainty sampling or mutual information maximization. However, these heuristics are misaligned with the ultimate goal of BAL, namely optimizing the model's final performance within the query budget, and this misalignment limits their generality. To address this, we formulate BAL as a Markov decision process (MDP) and propose a data-driven approach based on deep reinforcement learning. Our method learns the BAL strategy by maximizing the model's final performance. Experiments on the UCI benchmark show that our method achieves performance competitive with existing heuristic-based approaches.
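For readers unfamiliar with the heuristic baselines mentioned above, the following is a minimal sketch of batch uncertainty sampling, one of the hand-designed strategies the abstract contrasts against. It is not the learned method proposed in the paper. The function names (`entropy`, `select_batch`) and the toy probabilities are hypothetical, chosen only for illustration; the sketch assumes a classifier has already produced class probabilities for each point in the unlabeled pool.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_batch(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` most uncertain pool points.

    Uncertainty is measured by predictive entropy; higher means less certain.
    """
    scores = [entropy(p) for p in pool_probs]
    # Rank pool indices from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical predicted probabilities for four unlabeled points (binary task).
pool_probs = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # nearly undecided
    [0.70, 0.30],
    [0.50, 0.50],  # maximally uncertain
]

batch = select_batch(pool_probs, batch_size=2)
print(batch)  # indices to send to the annotator this round
```

A rule like this scores each point independently and never looks at the model's final performance, which is the kind of mismatch with the end goal that the abstract argues motivates learning the selection strategy instead.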