Improving Accuracy and Calibration of Deep Image Classifiers With Agreement-Driven Dynamic Ensemble
Authors: Pedro Conde; Rui L. Lopes; Cristiano Premebida
Journal: IEEE Open Journal of the Computer Society, vol. 6, pp. 164-175
Published: 2024-12-18
DOI: 10.1109/OJCS.2024.3519984
URL: https://ieeexplore.ieee.org/document/10806808/
One of the biggest challenges in applying Deep Learning (DL) systems to real-world problems is the possibility of failure in critical situations. Strategies to tackle this problem are twofold: (i) models need to be highly accurate, which reduces the risk of failure; (ii) since the risk of error cannot be completely eliminated, models should be able to report their level of uncertainty for each prediction. As such, state-of-the-art DL models should be accurate and also calibrated, meaning that each prediction has to encode its confidence/uncertainty in a way that approximates the true likelihood of correctness. Nonetheless, relevant literature shows that improvements in accuracy and calibration are not usually related. This motivates the development of Agreement-Driven Dynamic Ensemble, a deep ensemble method that, by dynamically combining the advantages of two different ensemble strategies, is capable of achieving the highest possible accuracy values while also obtaining substantial improvements in calibration. The merits of the proposed algorithm are shown through a series of representative experiments that compare it against multiple state-of-the-art baselines, using two different neural network architectures and three different datasets.
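
The notion of calibration used above (confidence that approximates the true likelihood of correctness) can be made concrete with the Expected Calibration Error (ECE), a widely used metric. The abstract does not say which calibration metric the authors report, so the sketch below is only an illustration under that assumption. The function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are choices made here, not details taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE with equal-width confidence bins (assumed setup).

    confidences: top-class probability of each prediction, in [0, 1].
    correct: whether each prediction matched the true label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("at least one prediction is required")

    # Per-bin accumulators: sample count, summed confidence, correct count.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins

    for conf, ok in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); confidence 1.0 goes in the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hits[b] += int(ok)

    total = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sums[b] / counts[b]
        accuracy = hits[b] / counts[b]
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (counts[b] / total) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Two predictions made with full confidence, only one of them correct:
    # the model is overconfident by 0.5 on average.
    print(expected_calibration_error([1.0, 1.0], [True, False]))  # 0.5
```

An ECE of 0 means that, within every bin, the average confidence equals the observed accuracy. Because ECE measures this gap and not accuracy itself, a model can improve on one without improving on the other, which is consistent with the abstract's remark that the two are not usually related.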