{"title":"Cross entropy and log likelihood ratio cost as performance measures for multi-conclusion categorical outcomes scales.","authors":"Eric M Warren, John C Handley, H David Sheets","doi":"10.1111/1556-4029.15686","DOIUrl":null,"url":null,"abstract":"<p><p>The inconclusive category in forensics reporting is the appropriate response in many cases, but it poses challenges in estimating an \"error rate\". We discuss the use of a class of information-theoretic measures related to cross entropy as an alternative set of metrics that allows for performance evaluation of results presented using multi-category reporting scales. This paper shows how this class of performance metrics, and in particular the log likelihood ratio cost, which is already in use with likelihood ratio forensic reporting methods and in machine learning communities, can be readily adapted for use with the widely used multiple category conclusions scales. Bayesian credible intervals on these metrics can be estimated using numerical methods. The application of these metrics to published test results is shown. It is demonstrated, using these test results, that reducing the number of categories used in a proficiency test from five or six to three increases the cross entropy, indicating that the higher number of categories was justified, as it they increased the level of agreement with ground truth.</p>","PeriodicalId":94080,"journal":{"name":"Journal of forensic sciences","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of forensic sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1111/1556-4029.15686","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
The inconclusive category in forensics reporting is the appropriate response in many cases, but it poses challenges in estimating an "error rate". We discuss the use of a class of information-theoretic measures related to cross entropy as an alternative set of metrics that allows for performance evaluation of results presented using multi-category reporting scales. This paper shows how this class of performance metrics, and in particular the log likelihood ratio cost, which is already in use with likelihood ratio forensic reporting methods and in machine learning communities, can be readily adapted for use with the widely used multiple category conclusions scales. Bayesian credible intervals on these metrics can be estimated using numerical methods. The application of these metrics to published test results is shown. It is demonstrated, using these test results, that reducing the number of categories used in a proficiency test from five or six to three increases the cross entropy, indicating that the higher number of categories was justified, as it they increased the level of agreement with ground truth.