Sami Wood, Erin Lanus, Daniel D. Doyle, Jeremy Ogorzalek, C. Franck, Laura J. Freeman
Title: Developing Hierarchies for Image Classification Model Evaluation
DOI: 10.1109/AI4I51902.2021.00016
Published in: 2021 4th International Conference on Artificial Intelligence for Industries (AI4I), September 2021
Citations: 0
Abstract
Classes within computer vision (CV) datasets often exhibit hierarchical structure, such as super-subordinate IS-A relations. While some common performance metrics for evaluating CV models, such as "top-5 error," ignore hierarchical structure, metrics for hierarchical scoring do exist; their effectiveness for meaningful evaluation, however, depends on how well the hierarchy reflects important semantic relationships between classes. Most hierarchical scoring methods reward closeness between the predicted and ground-truth classes. Such schemes may produce the same score when a child is misclassified as a terrorist as when a car is misclassified as a vehicle or helicopter, ignoring the very different impacts of these misclassifications. An approach is needed for developing context-aware hierarchies, usable with existing evaluation metrics, that reflect the cost of misclassification. The contribution of this paper is a hierarchy construction framework that, given a list of importance-ordered categories and a hierarchical scoring method, penalizes misclassifications accordingly. The framework is demonstrated in a hierarchy selection use case and compared quantitatively against the "top-5 error" metric and a simple super-subordinate hierarchical scoring. We qualitatively discuss the efficacy and implications of each approach.
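To make the notion of "closeness" in hierarchical scoring concrete, the following is a minimal sketch of a common family of such metrics: distance to the lowest common ancestor (LCA) in the class tree. The toy hierarchy, class names, and edge-count cost are illustrative assumptions, not the paper's actual framework; they merely show how a vehicle-class confusion and a person-class confusion can receive identical penalties when context is ignored.

```python
# Sketch of hierarchical scoring via lowest common ancestor (LCA) distance.
# The hierarchy below is a hypothetical toy example, not the paper's.

# Each class maps to its parent; None marks the root.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "helicopter": "vehicle",
    "person": "entity",
    "child": "person",
    "terrorist": "person",
}


def path_to_root(node):
    """Return the chain of classes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def hierarchical_distance(pred, truth):
    """Count edges from prediction and truth up to their lowest common ancestor."""
    truth_path = path_to_root(truth)
    truth_ancestors = set(truth_path)
    # Walk up from the prediction until we reach an ancestor of the truth.
    for up, node in enumerate(path_to_root(pred)):
        if node in truth_ancestors:
            return up + truth_path.index(node)


print(hierarchical_distance("car", "car"))         # 0: exact match
print(hierarchical_distance("car", "helicopter"))  # 2: siblings under "vehicle"
print(hierarchical_distance("child", "terrorist")) # 2: same score, worse error
```

Note that the sibling confusions "car vs. helicopter" and "child vs. terrorist" both score 2 under this purely structural metric, which is exactly the context-insensitivity the paper's importance-ordered hierarchy construction is meant to address.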