A. Popowicz, Krystian Radlak, S. Lasota, Karolina Szczepankiewicz, Michal Szczepankiewicz
2022 26th International Conference on Methods and Models in Automation and Robotics (MMAR), published 2022-08-22. DOI: 10.1109/MMAR55195.2022.9874318 (https://doi.org/10.1109/MMAR55195.2022.9874318)
Improving the detection of noisy labels in image datasets using modified Confidence Learning
The effectiveness of machine learning algorithms, including deep neural networks (DNNs) for classifying image data, depends on proper preparation of the training dataset. Erroneously labeled images in the training data degrade algorithmic efficiency and cause unpredictable model behavior, thereby reducing the model's safety. Verifying labels in the numerous available databases remains a complicated and laborious task. In this article, we present MultiNET, an approach that allows for efficient verification of labeled image datasets. We adapt a state-of-the-art technique, Confidence Learning, extending its flexibility and improving its effectiveness by combining outcomes from various DNN architectures. With the proposed modification, incorrect labels can be detected automatically while minimizing the number of false positives, which makes the verification process much less burdensome. The technique may be useful to researchers and software engineers who work with externally supplied image datasets.
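The idea of combining outcomes from several networks can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the per-network outcomes are combined, so the voting rule below (`min_votes`) is an assumption, as are all function names and the toy probabilities. The per-network step is a simplified form of the confidence-based rule: each class gets a threshold equal to the mean predicted probability of that class over the samples labeled with it, and a sample is flagged when its most probable above-threshold class differs from its given label.

```python
import numpy as np

def class_thresholds(probs, labels):
    # Per-class threshold: mean predicted probability of class k over the
    # samples whose given label is k (assumes every class occurs in labels).
    n_classes = probs.shape[1]
    return np.array([probs[labels == k, k].mean() for k in range(n_classes)])

def flag_suspects(probs, labels):
    # Simplified single-network rule (an assumption, not the paper's code):
    # keep only classes whose probability reaches the class threshold, pick
    # the most probable of those, and flag the sample if it is not the label.
    thresholds = class_thresholds(probs, labels)
    above = probs >= thresholds
    masked = np.where(above, probs, -np.inf)
    confident_class = masked.argmax(axis=1)
    has_confident = above.any(axis=1)
    return has_confident & (confident_class != labels)

def consensus_suspects(probs_per_net, labels, min_votes):
    # Hypothetical combination rule: a label is reported only if at least
    # min_votes networks flag it independently.
    votes = np.sum([flag_suspects(p, labels) for p in probs_per_net], axis=0)
    return votes >= min_votes

# Toy data: 6 samples, 2 classes, predictions from two hypothetical networks.
labels = np.array([0, 0, 0, 1, 1, 1])
probs_a = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
                    [0.2, 0.8], [0.1, 0.9], [0.7, 0.3]])
probs_b = np.array([[0.85, 0.15], [0.9, 0.1], [0.2, 0.8],
                    [0.1, 0.9], [0.3, 0.7], [0.4, 0.6]])

flags_a = flag_suspects(probs_a, labels)
flags_b = flag_suspects(probs_b, labels)
agreed = consensus_suspects([probs_a, probs_b], labels, min_votes=2)
print(np.flatnonzero(flags_a), np.flatnonzero(flags_b), np.flatnonzero(agreed))
```

In this toy run the first network flags samples 2 and 5, the second flags only sample 2, and requiring both votes leaves sample 2 alone. Raising `min_votes` trades recall for fewer false positives, which is the kind of trade-off the abstract describes; the actual combination scheme in the paper may differ.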