Weizhi Liu, Haobin Li, L. Lee, E. P. Chew, Hui Xiao
Title: Optimal Computing Budget Allocation for Binary Classification with Noisy Labels and its Applications on Simulation Analytics
Published in: 2019 Winter Simulation Conference (WSC), 2019-12-01
DOI: https://doi.org/10.1109/WSC40007.2019.9004832
Citation count: 1
Abstract
In this study, we consider the budget allocation problem for binary classification with noisy labels. Classification accuracy can be improved by reducing label noise, which can be achieved by collecting multiple independent observations of each label. Hence, an efficient budget allocation strategy is needed that reduces the label noise while still guaranteeing promising classification accuracy. Two problem settings are investigated in this work. The first assumes that the underlying classification structure is unknown, so the label of a data point can only be determined by comparing the sample-average estimate of its Bernoulli success probability with a given threshold. The second assumes that data points with different labels can be separated by a hyperplane. For both cases, closed-form optimal budget allocation strategies are developed. A simulation analytics example demonstrates how the budget is allocated to different scenarios to further improve the learning of optimal decision functions.
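As a rough illustration of the first setting only (this is not the paper's allocation rule), the sketch below computes the exact probability that a data point is mislabeled when its label is estimated by comparing the sample mean of n Bernoulli observations with a threshold. The function name `mislabel_probability`, the default threshold of 0.5, and the tie rule (a sample mean equal to the threshold counts as the negative label) are assumptions made for this example.

```python
from math import comb


def mislabel_probability(p: float, n: int, threshold: float = 0.5) -> float:
    """Exact probability that the label estimated from n Bernoulli(p)
    observations differs from the true label.

    Assumed convention (hypothetical, not from the paper): the true label is
    positive when p > threshold, and the estimated label is positive when the
    sample mean is strictly greater than the threshold.
    """
    true_label_positive = p > threshold
    # Binomial probability that the sample mean k/n exceeds the threshold.
    prob_estimated_positive = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if k / n > threshold
    )
    if true_label_positive:
        return 1 - prob_estimated_positive
    return prob_estimated_positive


if __name__ == "__main__":
    # Odd n avoids ties with the 0.5 threshold.
    for p in (0.55, 0.6, 0.9):
        for n in (1, 3, 21):
            print(f"p={p:.2f} n={n:2d} mislabel={mislabel_probability(p, n):.4f}")
```

Under these assumptions, points whose success probability lies close to the threshold need many more observations to reach the same label reliability as points far from it, which is the intuition behind allocating the budget unevenly rather than equally.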