Learning Algorithm in Two-Stage Selective Prediction

2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML). Pub Date: 2022-03-01. DOI: 10.1109/CACML55074.2022.00093

Weicheng Ye, Dangxing Chen, Ilqar Ramazanli


Citations: 2

Abstract

Data gathered from real-world applications often suffer from corruption. Low-quality data hinders the performance of a learning system in terms of classification accuracy, model building time, and interpretability of the classifier. Selective prediction, also known as prediction with a reject option, aims to reduce the error rate by abstaining from prediction under uncertainty while keeping coverage as high as possible. Deep neural networks (DNNs) have a high capacity for fitting large-scale data. If DNNs can exploit the coverage trade-off offered by selective prediction, their performance can potentially be improved. However, current DNNs with an embedded reject option require the rejection threshold to be known in advance, and searching for the threshold is inefficient in large-scale applications. In addition, abstaining from prediction on part of a dataset increases model bias and might not be optimal. To resolve these problems, we propose threshold learning algorithms, integrated with selective prediction, that can estimate the intrinsic rejection rate of the dataset. Correspondingly, we provide a rigorous framework to generalize the estimation of the data corruption rate. To leverage the advantages of multiple learning algorithms, we extend our learning algorithms to a hierarchical two-stage system. Our methods are flexible and can be used with any neural network architecture. The empirical results show that our algorithms can achieve state-of-the-art performance on challenging real-world datasets in both classification and regression problems.
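To make the reject option and the coverage trade-off concrete, the following is a minimal sketch of a plain confidence-threshold rule, not the threshold learning algorithm proposed in the paper. The function names `selective_predict` and `coverage_and_selective_error`, the toy probabilities, and the fixed thresholds are all assumptions introduced for illustration. The rule abstains whenever the top class probability falls below a given threshold, then reports coverage (the fraction of inputs answered) and the error rate on the answered inputs.

```python
import numpy as np


def selective_predict(probs, threshold):
    """Confidence-threshold reject rule (illustrative, not the paper's method).

    probs: array of shape (n_samples, n_classes) with class probabilities.
    Returns the argmax predictions and a boolean mask of accepted samples.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)       # top class probability per sample
    predictions = probs.argmax(axis=1)   # predicted class per sample
    accepted = confidence >= threshold   # abstain when confidence is too low
    return predictions, accepted


def coverage_and_selective_error(probs, labels, threshold):
    """Return (coverage, error rate on accepted samples) for one threshold."""
    predictions, accepted = selective_predict(probs, threshold)
    labels = np.asarray(labels)
    coverage = float(accepted.mean())
    if not accepted.any():
        # Nothing was answered, so the selective error is undefined.
        return coverage, float("nan")
    errors = predictions[accepted] != labels[accepted]
    return coverage, float(errors.mean())


# Hypothetical toy data: four samples, two classes.
toy_probs = np.array([[0.90, 0.10],
                      [0.60, 0.40],
                      [0.20, 0.80],
                      [0.55, 0.45]])
toy_labels = np.array([0, 1, 1, 0])

# Sweeping a hand-picked grid of thresholds shows the trade-off:
# a higher threshold lowers coverage and can lower the selective error.
for t in (0.0, 0.7):
    cov, err = coverage_and_selective_error(toy_probs, toy_labels, t)
    print(f"threshold={t:.1f} coverage={cov:.2f} selective_error={err:.2f}")
```

A grid sweep like this is the kind of threshold search the abstract describes as inefficient at scale, which is the motivation it gives for learning the threshold instead.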
