Assessing accuracy for multi-class classification when subclasses are involved.

IF 1.6, Zone 3 (Medicine), JCR Q3, HEALTH CARE SCIENCES & SERVICES
Nan Nan, Lili Tian
DOI: 10.1177/09622802251343600 (https://doi.org/10.1177/09622802251343600)
Journal: Statistical Methods in Medical Research
Publication date: 2025-06-05
Publication type: Journal Article
Open access: no
Citations: 0

Abstract

Classifications that involve subclasses are common in many applied fields. "Compound multi-class classification" refers to settings that involve three or more main classes, at least one of which has multiple subclasses. In this paper, we propose an accuracy metric appropriate for "compound M-class classification," namely the "hypervolume under the compound ROC manifold (HUM_{C,M})." The proposed HUM_{C,M} evaluates the overall accuracy of a biomarker, measured on a continuous scale, in correctly identifying the M main classes. It does not require specifying how the subclasses within a main class are ordered relative to each other in terms of marker values. The probabilistic interpretation of HUM_{C,M} is derived analytically. A network-based algorithm is developed that enables efficient computation of the empirical estimate of HUM_{C,M}. Non-parametric bootstrap percentile confidence intervals for HUM_{C,M} are assessed through extensive simulation studies. Lastly, a real data example illustrates the use of the proposed method.
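The abstract does not give the definition of the compound metric or the network-based algorithm, so neither can be reproduced here. As a rough illustration of the two ingredients it mentions (an empirical hypervolume estimate and a non-parametric bootstrap percentile interval), the sketch below uses the ordinary three-class HUM, that is, the proportion of triples with one marker value per class that fall in the assumed order. The function names, the strict-inequality treatment of ties, the per-class resampling, and the simple order-statistic quantile rule are all assumptions made for this example, not the authors' method.

```python
import random
from bisect import bisect_left, bisect_right


def empirical_hum3(x1, x2, x3):
    """Proportion of triples (a, b, c), one value per class, with a < b < c.

    Ordinary three-class HUM (no subclasses); this is NOT the paper's
    compound metric. Ties are counted as incorrect orderings (assumption).
    """
    low = sorted(x1)
    high = sorted(x3)
    total = 0
    for b in x2:
        below = bisect_left(low, b)                # class-1 values strictly below b
        above = len(high) - bisect_right(high, b)  # class-3 values strictly above b
        total += below * above
    return total / (len(x1) * len(x2) * len(x3))


def bootstrap_percentile_ci(x1, x2, x3, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval from resampling each class with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        r1 = rng.choices(x1, k=len(x1))
        r2 = rng.choices(x2, k=len(x2))
        r3 = rng.choices(x3, k=len(x3))
        stats.append(empirical_hum3(r1, r2, r3))
    stats.sort()
    # Simple order-statistic quantiles (hypothetical convention for this sketch).
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Sorting the two outer classes once keeps the estimate at O(n log n) rather than enumerating all triples; a chance-level marker gives a value near 1/6 for three classes, and a perfectly separating marker gives 1.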

Source journal
Statistical Methods in Medical Research (Medicine: Mathematical & Computational Biology)
CiteScore: 4.10
Self-citation rate: 4.30%
Articles published: 127
Review time: >12 weeks
About the journal: Statistical Methods in Medical Research is a peer-reviewed scholarly journal and is the leading vehicle for articles in all the main areas of medical statistics and an essential reference for all medical statisticians. This unique journal is devoted solely to statistics and medicine and aims to keep professionals abreast of the many powerful statistical techniques now available to the medical profession. This journal is a member of the Committee on Publication Ethics (COPE).